PREFACE


This is an introductory undergraduate textbook in set theory. In mathe- 
matics these days, essentially everything is a set. Some knowledge of set 
theory is a necessary part of the background everyone needs for further 
study of mathematics. It is also possible to study set theory for its own 
interest—it is a subject with intriguing results about simple objects. This 
book starts with material that nobody can do without. There is no end 
to what can be learned of set theory, but here is a beginning. 

The author of a book always has a preferred manner for using the 
book: A reader should simply study it from beginning to end. But in 
practice, the users of a book have their own goals. I have tried to build 
into the present book enough flexibility to accommodate a variety of goals. 

The axiomatic material in the text is marked by a stripe in the margin. 
The purpose of the stripe is to allow a user to deemphasize the axiomatic 
material, or even to omit it entirely. 

A course in axiomatic set theory might reasonably cover the first six or 
seven chapters, omitting Chapter 5. This is the amount of set theory that 
everyone with an interest in matters mathematical should know. Those with 
a special interest in set theory itself are encouraged to continue to the end 
of the book (and beyond). A very different sort of course might emphasize
the set-theoretic construction of the number systems. This course might
cover the first five chapters, devoting only as much attention to the axiomatic 
material as desired. The book presupposes no specific background. It does 
give real proofs. The first difficult proof is not met until part way through 
Chapter 4. 

The hierarchical view of sets, constructed by transfinite iteration of the 
power set operation, is adopted from the start. The axiom of regularity is 
not added until it can be proved to be equivalent to the assertion that 
every set has a rank. 

The exercises are placed at the end of each (or nearly each) section. 
In addition, Chapters 2, 3, and 4 have “Review Exercises” at the ends of the
chapters. These are comparatively straightforward exercises for the reader 
wishing additional review of the material. There are, in all, close to 300 
exercises. 

There is a brief appendix dealing with some topics from logic, such as 
truth tables and quantifiers. This appendix also contains an example of how 
one might discover a proof. 

At the end of this text there is an annotated list of books recommended 
for further study. In fact it includes diverse books for several further studies 
in a variety of directions. Those wishing to track down the source of particular
results or historical points are referred to the books on the list that provide 
specific citations. 

There are two stylistic matters that require mention. The end of a proof 
is marked by a reversed turnstile (⊣). This device is due to C. C. Chang
and H. J. Keisler. In definitions, I generally pass up the traditionally correct
“if” in favor of the logically correct “iff” (meaning “if and only if”).

Two preliminary editions of the text have been used in my courses at 
UCLA. I would be pleased to receive comments and corrections from 
further users of the book. 
CHAPTER 1


INTRODUCTION 


BABY SET THEORY 


We shall begin with an informal discussion of some basic concepts of set 
theory. In these days of the “new math,” much of this material will be 
already familiar to you. Indeed, the practice of beginning each mathematics 
course with a discussion of set theory has become widespread, extending 
even to the elementary schools. But we want to review here elementary- 
school set theory (and do it in our notation). Along the way we shall be able to 
point out some matters that will become important later. We shall not, in 
these early sections, be particularly concerned with rigor. The more serious 
work will start in Chapter 2. 

A set is a collection of things (called its members or elements), the collection 
being regarded as a single object. We write “t ∈ A” to say that t is a member
of A, and we write “t ∉ A” to say that t is not a member of A.

For example, there is the set whose members are exactly the prime 
numbers less than 10. This set has four elements, the numbers 2, 3, 5, and 7. 
We can name the set conveniently by listing the members within braces 
(curly brackets): 


{2, 3, 5, 7}


Call this set A. And let B be the set of all solutions to the polynomial
equation


x^4 - 17x^3 + 101x^2 - 247x + 210 = 0.


Now it turns out (as the industrious reader can verify) that the set B has 
exactly the same four members, 2, 3, 5, and 7. For this reason A and B are 
the same set, i.e., A = B. It matters not that A and B were defined in 
different ways. Because they have exactly the same elements, they are 
equal; that is, they are one and the same set. We can formulate the general 
principle: 


Principle of Extensionality If two sets have exactly the same members, 
then they are equal. 


Here and elsewhere, we can state things more concisely and less 
ambiguously by utilizing a modest amount of symbolic notation. Also we 
abbreviate the phrase “if and only if” as “iff.” Thus we have the restatement: 


Principle of Extensionality If A and B are sets such that for every 
object t, 
t ∈ A iff t ∈ B,
then A = B. 


For example, the set of primes less than 10 is the same as the set of 
solutions to the equation x^4 - 17x^3 + 101x^2 - 247x + 210 = 0. And the set
{2} whose only member is the number 2 is the same as the set of even primes. 
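
The example can be checked mechanically. What follows is a minimal Python sketch, not part of the text's development: it models finite sets with the built-in frozenset, and the helper name p and the search range for integer roots are assumptions made only for illustration.

```python
# A is named by listing its members; B is named by a condition
# (being an integer root of the quartic from the text).
def p(x: int) -> int:
    return x**4 - 17 * x**3 + 101 * x**2 - 247 * x + 210

A = frozenset({2, 3, 5, 7})

# Assumption of this sketch: we search a range of integers. A quartic has at
# most four roots, so once four are found there can be no others.
B = frozenset(x for x in range(-250, 251) if p(x) == 0)

# Extensionality: exactly the same members, hence the same set.
same_members = all((t in A) == (t in B) for t in A | B)
print(same_members, A == B)

# The set {2} is the same as the set of even primes (checked up to 1000 only).
even_primes = frozenset(
    n for n in range(2, 1000, 2)
    if all(n % d for d in range(2, int(n**0.5) + 1))
)
print(even_primes == frozenset({2}))
```

The two definitions differ, yet the comparison A == B looks only at the members, which is the content of the principle.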

Incidentally, we write “A = B” to mean that A and B are the same object.
That is, the expression “A” on the left of the equality symbol names the 
same object as does the expression “B” on the right. If A = B, then 
automatically (i.e., by logic) anything that is true of the object A is also true 
of the object B (it being the same object). For example, if A = B, then it is 
automatically true that for any object t, t ∈ A iff t ∈ B. (This is the converse
to the principle of extensionality.) As usual, we write “A ≠ B” to mean that
it is not true that A = B. 

A small set would be a set {0} having only one member, the number 0. 
An even smaller set is the empty set Ø. The set Ø has no members at all.
Furthermore it is the only set with no members, since extensionality tells us 
that any two such sets must coincide. It might be thought at first that the 
empty set would be a rather useless or even frivolous set to mention, but, 
in fact, from the empty set by various set-theoretic operations a surprising
array of sets will be constructed. 

For any objects x and y, we can form the pair set {x, y} having just the 
members x and y. Observe that {x, y} = {y, x}, as both sets have exactly the 
same members. As a special case we have (when x = y) the set {x, x} = {x}. 
For example, we can form the set {Ø} whose only member is Ø. Note
that {Ø} ≠ Ø, because Ø ∈ {Ø} but Ø ∉ Ø. The fact that {Ø} ≠ Ø is
reflected in the fact that a man with an empty container is better off than a
man with nothing: at least he has the container. Also we can form {{Ø}},
{{{Ø}}}, and so forth, all of which are distinct (Exercise 2).

Similarly for any objects x, y, and z we can form the set {x, y, z}. More 
generally, we have the set {x_1, ..., x_n} whose members are exactly the
objects x_1, ..., x_n. For example,


{Ø, {Ø}, {{Ø}}}


is a three-element set. 
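
These small sets can be built concretely. The sketch below is an illustration only, assuming that Python's frozenset is an adequate stand-in for a finite set; the variable names are invented here, and running it is a check, not a proof.

```python
empty = frozenset()                  # Ø
s1 = frozenset({empty})              # {Ø}
s2 = frozenset({s1})                 # {{Ø}}
s3 = frozenset({s2})                 # {{{Ø}}}

# {Ø} differs from Ø: one has a member, the other has none.
print(len(empty), len(s1))           # 0 1
print(empty in s1, empty in empty)   # True False

# The four sets built so far are pairwise distinct.
print(len({empty, s1, s2, s3}))      # 4

# Pair sets ignore order and repetition: {x, y} = {y, x} and {x, x} = {x}.
x, y = 1, 2
print(frozenset({x, y}) == frozenset({y, x}))   # True
print(frozenset([x, x]) == frozenset({x}))      # True

# {Ø, {Ø}, {{Ø}}} is a three-element set.
three = frozenset({empty, s1, s2})
print(len(three))                    # 3
```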


Fig. 1. The shaded areas represent (a) A ∪ B and (b) A ∩ B.


Two other familiar operations on sets are union and intersection. The 
union of sets A and B is the set A ∪ B of all things that are members of
A or B (or both). Similarly the intersection of A and B is the set A ∩ B of all
things that are members of both A and B. For example, 


{x, y} ∪ {z} = {x, y, z}
and


{2, 3, 5, 7} ∩ {1, 2, 3, 4} = {2, 3}.


Figure 1 gives the usual pictures illustrating these operations. Sets A and B 
are said to be disjoint when they have no common members, i.e., when 
A ∩ B = Ø.
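
For finite sets these operations are directly available in Python. The following is a brief sketch under that assumption; the set C is a hypothetical example chosen here to illustrate disjointness.

```python
A = frozenset({2, 3, 5, 7})
B = frozenset({1, 2, 3, 4})

print(sorted(A | B))   # union: [1, 2, 3, 4, 5, 7]
print(sorted(A & B))   # intersection: [2, 3]

# Disjoint sets have no common members, i.e., their intersection is empty.
C = frozenset({8, 9})  # hypothetical third set
print(A & C == frozenset(), A.isdisjoint(C))   # True True
print(A.isdisjoint(B))                         # False
```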

A set A is said to be a subset of a set B (written A ⊆ B) iff all the
members of A are also members of B. Note that any set is a subset of itself.
At the other extreme, Ø is a subset of every set. This fact (that Ø ⊆ A for
any A) is “vacuously true,” since the task of verifying, for every member of 
Ø, that it also belongs to A, requires doing nothing at all. 

If A ⊆ B, then we also say that A is included in B or that B includes A.
The inclusion relation (⊆) is not to be confused with the membership
relation (∈). If we want to know whether A ∈ B, we look at the set A as a
single object, and we check to see if this single object is among the members
of B. By contrast, if we want to know whether A ⊆ B, then we must open up
the set A, examine its various members, and check whether its various
members can be found among the members of B.


Examples 1. Ø ⊆ Ø, but Ø ∉ Ø.

2. {Ø} ∈ {{Ø}} but {Ø} ⊈ {{Ø}}. {Ø} is not a subset of {{Ø}} because
there is a member of {Ø}, namely Ø, that is not a member of {{Ø}}.

3. Let Us be the set of all people in the United States, and let Un be
the set of all countries belonging to the United Nations. Then


John Jones ∈ Us ∈ Un.


But John Jones ∉ Un (since he is not even a country), and hence Us ⊈ Un.
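
The contrast between membership and inclusion can be seen in a short Python sketch. It assumes frozenset as a model of finite sets, with `in` playing the role of ∈ and `<=` the role of ⊆. The population data (the names other than John Jones, and the single extra country) are hypothetical placeholders.

```python
empty = frozenset()          # Ø
s1 = frozenset({empty})      # {Ø}
s2 = frozenset({s1})         # {{Ø}}

# Example 1: Ø is a subset of Ø, but not a member of Ø.
print(empty <= empty, empty in empty)    # True False

# Example 2: {Ø} is a member of {{Ø}}, but not a subset of it.
print(s1 in s2, s1 <= s2)                # True False
# The witness: a member of {Ø} that is not a member of {{Ø}}.
print([m for m in s1 if m not in s2])    # [frozenset()]

# Example 3, with made-up data: a country is modeled as the set of its people.
us = frozenset({"John Jones", "Mary Smith"})
france = frozenset({"Marie Dupont"})
un = frozenset({us, france})

print("John Jones" in us, us in un)      # True True
print("John Jones" in un, us <= un)      # False False
```

Membership is not transitive here: John Jones belongs to Us and Us belongs to Un, yet John Jones does not belong to Un.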


Any set A will have one or more subsets. (In fact, if A has n elements, 
then A has 2^n subsets. But this is a matter we will take up much later.) We
can gather all of the subsets of A into one collection. We then have the set
of all subsets of A, called the power¹ set 𝒫A of A. For example,


𝒫Ø = {Ø},
𝒫{Ø} = {Ø, {Ø}},
𝒫{0, 1} = {Ø, {0}, {1}, {0, 1}}.
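
For a finite set the power set can be computed outright. The sketch below is one possible way to do it, assuming frozenset as the model of a set; the function name power_set is chosen here and is not notation from the text.

```python
from itertools import chain, combinations

def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    # Take every k-element selection of members, for k = 0, 1, ..., len(a).
    selections = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(s) for s in selections)

empty = frozenset()
print(power_set(empty) == frozenset({empty}))                            # True
print(power_set(frozenset({empty})) == frozenset({empty, frozenset({empty})}))  # True
print(len(power_set(frozenset({0, 1}))))                                 # 4

# A set with n elements has 2^n subsets (checked here for small n only).
print(all(len(power_set(frozenset(range(n)))) == 2**n for n in range(8)))  # True
```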

A very flexible way of naming a set is the method of abstraction. In this
method we specify a set by giving the condition—the entrance requirement— 
that an object must satisfy in order to belong to the set. In this way we 
obtain the set of all objects x such that x meets the entrance requirement. 
The notation used for the set of all objects x such that the condition
___x___ holds is

{x | ___x___}.
For example: 

1. 𝒫A is the set of all objects x such that x is a subset of A. Here “x
is a subset of A” is the entrance requirement that x must satisfy in order to
belong to 𝒫A. We can write

𝒫A = {x | x is a subset of A}
   = {x | x ⊆ A}.


1 The reasons for using the word “power” in this context are not very convincing, but the 
usage is now well established. 
2. A ∩ B is the set of all objects y such that y ∈ A and y ∈ B. We can
write


A ∩ B = {y | y ∈ A and y ∈ B}.


It is unimportant whether we use “x” or “y” or another letter as the
symbol (which is used as a pronoun) here.

3. The set {z | z ≠ z} equals Ø, because the entrance requirement
“z ≠ z” is not satisfied by any object z.

4. The set {n | n is an even prime number} is the same as the set {2}. 
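
Set comprehensions in Python mirror the abstraction notation closely, with one difference that should be stated plainly: a comprehension must draw its candidates from some given collection. In the sketch below the finite range called domain is an assumption of the illustration, as are the helper is_prime and the sample sets A and B.

```python
def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

domain = range(0, 100)   # assumed finite stock of candidate objects

# {n | n is an even prime number}, restricted to the domain
even_primes = frozenset(n for n in domain if n % 2 == 0 and is_prime(n))
print(even_primes == frozenset({2}))     # True

# {z | z != z}: no object meets the entrance requirement
print(frozenset(z for z in domain if z != z) == frozenset())   # True

# A ∩ B = {y | y in A and y in B}; the letter used as the pronoun is immaterial
A = frozenset({2, 3, 5, 7})
B = frozenset({1, 2, 3, 4})
with_y = frozenset(y for y in domain if y in A and y in B)
with_x = frozenset(x for x in domain if x in A and x in B)
print(with_y == with_x == A & B)         # True
```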


There are, however, some dangers inherent in the abstraction method. 
For certain bizarre choices of the entrance requirement, it may happen that 
there is no set containing exactly those objects meeting the entrance 
requirement. There are two ways in which disaster can strike. 

One of the potential disasters is illustrated by 


{x | x is a positive integer definable in one line of type}. 


The tricky word here is “definable.” Some numbers are easy to define in one 
line. For example, the following lines each serve to define a positive integer: 


12,317, 

the millionth prime number, 

the least number of the form 2^(2^n) + 1 that is not prime,
the 23rd perfect² number.


Observe that there are only finitely many possible lines of type (because 
there are only finitely many symbols available to the printer, and there is a 
limit to how many symbols will fit on a line). Consequently 


{x | x is a positive integer definable in one line of type} 


is only a finite set of integers. Consider the least positive integer not in this 
set; that is, consider 


the least positive integer not definable in one line of type. 


The preceding line defines a positive integer in one line, but that number is, 
by its construction, not definable in one line! So we are in trouble, and the 
trouble can be blamed on the entrance requirement of the set, i.e., on the 
phrase “is a positive integer definable in one line of type.” While it may have
appeared originally to be a meaningful entrance requirement, it now appears
to be gravely defective. (This example was given by G. G. Berry in 1906.
A related example was published in 1905 by Jules Richard.)


2 A positive integer is perfect if it equals the sum of its smaller divisors, e.g.,
6 = 1 + 2 + 3. It is deficient (or abundant) if the sum of its smaller divisors is less than (or
greater than, respectively) the number itself. This terminology is a vestigial trace of
numerology, the study of the mystical significance of numbers. The first four perfect numbers
are 6, 28, 496, and 8128.

There is a second disaster that can result from an overly free-swinging 
use of the abstraction method. It is exemplified by 


{x | x ∉ x},


that is, by the set of all objects that are not members of themselves. Call
this set A, and ask “is A a member of itself?” If A ∉ A, then A meets the
entrance requirement for A, whereupon A ∈ A. But on the other hand, if
A ∈ A, then A fails to meet the entrance requirement and so A ∉ A. Thus
both “A ∈ A” and “A ∉ A” are untenable. Again, we are in trouble. The
phrase “is not a member of itself” appears to be an illegal entrance
requirement for the abstraction method. (This example is known as Russell’s
paradox. It was communicated by Bertrand Russell in 1902 to Gottlob
Frege, and was published in 1903. The example was independently
discovered by Ernst Zermelo.)

These two sorts of disaster will be blocked in precise ways in our 
axiomatic treatment, and less formally in our nonaxiomatic treatment. The 
first sort of disaster (the Berry example) will be avoided by adherence 
to entrance requirements that can be stated in a totally unambiguous form, 
to be specified in the next chapter. The second sort of disaster will be 
avoided by the distinction between sets and classes. Any collection of sets 
will be a class. Some collections of sets (such as the collections Ø and
{Ø}) will be sets. But some collections of sets (such as the collection of all
sets not members of themselves) will be too large to allow as sets. These 
oversize collections will be called proper classes. The distinction will be 
discussed further presently. 

In practice, avoidance of disaster will not really be an oppressive or 
onerous task. We will merely avoid ambiguity and avoid sweepingly vast 
sets. A prudent person would not want to do otherwise. 


Exercises 


1. Which of the following become true when “∈” is inserted in place of the
blank? Which become true when “⊆” is inserted?


(a) {Ø} ___ {Ø, {Ø}}.


(d) {{Ø}} ___ {Ø, {{Ø}}}.
(e) {{Ø}} ___ {Ø, {Ø, {Ø}}}.
2. Show that no two of the three sets Ø, {Ø}, and {{Ø}} are equal to each
other.


3. Show that if B ⊆ C, then 𝒫B ⊆ 𝒫C.
4. Assume that x and y are members of a set B. Show that {{x}, {x, y}} ∈ 𝒫𝒫B.


SETS—AN INFORMAL VIEW 


We are about to present a somewhat vague description of how sets are 
obtained. (The description will be repeated much later in precise form.) 
None of our later work will actually depend on this informal description, 
but we hope it will illuminate the motivation behind some of the things 
we will do. 


Fig. 2. V_0 is the set A of atoms.


First we gather together all those things that are not themselves sets but 
that we want to have as members of sets. Call such things atoms. For 
example, if we want to be able to speak of the set of all two-headed 
coins, then we must include all such coins in our collection of atoms. 
Let A be the set of all atoms; it is the first set in our description. 

We now proceed to build up a hierarchy 


V_0 ⊆ V_1 ⊆ V_2 ⊆ ···


of sets. At the bottom level (in a vertical arrangement as in Fig. 2) we take
V_0 = A, the set of atoms. The next level will also contain all sets of atoms:


V_1 = V_0 ∪ 𝒫V_0 = A ∪ 𝒫A.


The third level contains everything that is in a lower level, plus all sets of
things from lower levels:


V_2 = V_1 ∪ 𝒫V_1.


And in general


V_{n+1} = V_n ∪ 𝒫V_n.


Thus we obtain successively V_0, V_1, V_2, .... But even this infinite
hierarchy does not include enough sets. For example, Ø ∈ V_1, {Ø} ∈ V_2,
{{Ø}} ∈ V_3, etc., but we do not yet have the infinite set


{Ø, {Ø}, {{Ø}}, ...}.


To remedy this lack, we take the infinite union


V_ω = V_0 ∪ V_1 ∪ ···,


and then let V_{ω+1} = V_ω ∪ 𝒫V_ω, and we continue. In general for any α,


V_{α+1} = V_α ∪ 𝒫V_α,


and this goes on “forever.” Whenever you might think that the construction
is finished, you instead take the union of all the levels obtained so far, take
the power set of that union, and continue.

A better explanation of the “forever” idea must be delayed until we 
discuss (in Chapter 7) the “numbers” being used as subscripts in the 
preceding paragraphs. These are the so-called “ordinal numbers.” The 
ordinal numbers begin with 0, 1, 2, ...; then there is the infinite number ω,
then ω + 1, ω + 2, ...; and this goes on “forever.”

A fundamental principle is the following: Every set appears somewhere 
in this hierarchy. That is, for every set a there is some α with a ∈ V_{α+1}.
That is what the sets are; they are the members of the levels of our 
hierarchy. 


Examples Suppose that a and b are sets. Say that a ∈ V_{α+1} and b ∈ V_{β+1},
and suppose that V_{β+1} is “higher” in the hierarchy than V_{α+1}. Then both
a and b are in V_{β+1}, since each level includes all lower levels. Consequently
in V_{β+2} we have the pair set {a, b}. On the other hand at no point do we
obtain a set of all sets, i.e., a set having all sets as members. There simply
is no such set.


There is one way in which we can simplify our picture. We were very 
indefinite about just what was in the set A of atoms. The fact of the matter 
is that the atoms serve no mathematically necessary purpose, so we banish
them; we take A = Ø. In so doing, we lose the ability to form sets of flowers
or sets of people. But this is no cause for concern; we do not need set theory
to talk about people and we do not need people in our set theory. But we
definitely do want to have sets of numbers, e.g., {2, 3 + iπ}. Numbers do not
appear at first glance to be sets. But as we shall discover (in Chapters 4 and 
5), we can find sets that serve perfectly well as numbers. 

Our theory then will ignore all objects that are not sets (as interesting 
and real as such objects may be). Instead we will concentrate just on “pure” 
sets that can be constructed without the use of such external objects. In 


Fig. 3. The ordinals are the backbone of the universe. 


particular, any member of one of our sets will itself be a set, and each of its 
members, if any, will be a set, and so forth. (This does not produce an 
infinite regress, because we stop when we reach Ø.) 

Now that we have banished atoms, the picture becomes narrower 
(Fig. 3). The construction is also simplified. We have defined V_{α+1} to be
V_α ∪ 𝒫V_α. Now it turns out that this is the same as A ∪ 𝒫V_α (see Exercise
6). With A = Ø, we have simply V_{α+1} = 𝒫V_α.
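
The finite levels of the atom-free hierarchy are small enough to build by machine. The following sketch is offered only as an illustration: it assumes frozenset as the model of a set, and the names power_set and levels are invented here. It follows the original definition (each level is the previous level together with its power set) and then checks, for these few levels only, that the simplified form gives the same result.

```python
from itertools import chain, combinations

def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    selections = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(s) for s in selections)

def levels(count):
    """Return [V_0, V_1, ..., V_count] with no atoms, i.e., V_0 = Ø."""
    v = frozenset()
    out = [v]
    for _ in range(count):
        v = v | power_set(v)      # next level = current level ∪ its power set
        out.append(v)
    return out

vs = levels(4)
print([len(v) for v in vs])       # [0, 1, 2, 4, 16]

# Each level includes all lower levels ...
print(all(vs[n] <= vs[n + 1] for n in range(4)))               # True
# ... and, with no atoms, the next level is just the power set.
print(all(vs[n + 1] == power_set(vs[n]) for n in range(4)))    # True
```

The sizes grow as repeated powers of 2, so this direct approach stops being practical after a few more levels.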


Exercises 


5. Define the rank of a set c to be the least α such that c ⊆ V_α. Compute
the rank of {{Ø}}. Compute the rank of {Ø, {Ø}, {Ø, {Ø}}}.

6. We have stated that V_{α+1} = A ∪ 𝒫V_α. Prove this at least for α < 3.


7. List all the members of V_3. List all the members of V_4. (It is to be
assumed here that there are no atoms.) 
CLASSES


There is no “set of all sets,” i.e., there is no set having all sets as members. 
This is in accordance with our informal image of the hierarchical way sets 
are constructed. Later, the nonexistence of a set of all sets will become a 
theorem (Theorem 2A), provable from the axioms. 

Nonetheless, there is some mild inconvenience that results if we forbid 
ourselves even to speak of the collection of all sets. The collection cannot be 
a set, but what status can we give it? Basically there are two alternatives: 


The Zermelo-Fraenkel alternative The collection of all sets need have 
no ontological status at all, and we need never speak of it. When tempted 
to speak of it, we can seek a rephrasing that avoids it. 


The von Neumann-Bernays alternative The collection of all sets can be 
called a class. Similarly any other collection of sets can be called a class. 
In particular, any set is a class, but some classes are too large to be sets. 
Informally, a class A is a set if it is included in some level V_α of our hierarchy
(and then is a member of V_{α+1}). Otherwise it is not a set, and can never be
a member of a set. 


For advanced works in set theory, the Zermelo-Fraenkel alternative 
seems to be the better of the two. It profits from the simplicity of having 
to deal with only one sort of object (sets) instead of two (classes and sets). 
And the circumlocutions it requires (in order to avoid reference to classes 
that are not sets) are things to which set-theorists learn early to adapt. 

For introductory works in set theory (such as this book), the choice 
between the two alternatives is less clear. The prohibition against mentioning 
any class that fails to be a set seems unnatural and possibly unfair. Desiring 
to have our cake and eat it too, we will proceed as follows. We officially 
adopt the Zermelo-Fraenkel alternative. Consequently the axioms and 
theorems shall make no mention of any class that is not a set. But in the 
expository comments, we will not hesitate to mention, say, the class of all 
sets if it appears helpful to do so. To avoid confusion, we will reserve upper- 
case sans serif letters (A, B, ...) for classes that are not guaranteed to be sets. 


AXIOMATIC METHOD 


In this book we are going to state the axioms of set theory, and we 
are going to show that our theorems are consequences of those axioms. The 
great advantage of the axiomatic method is that it makes totally explicit 
just what our initial assumptions are. 

It is sometimes said that “mathematics can be embedded in set theory.” 
This means that mathematical objects (such as numbers and differentiable 
functions) can be defined to be certain sets. And the theorems of mathematics
(such as the fundamental theorem of calculus) then can be viewed as state- 
ments about sets. Furthermore, these theorems will be provable from our 
axioms. Hence our axioms provide a sufficient collection of assumptions 
for the development of the whole of mathematics—a remarkable fact. (In 
Chapter 5 we will consider further the procedure for embedding mathematics 
in set theory.) 

The axiomatic method has been useful in other subjects as well as in set 
theory. Consider plane geometry, for example. It is quite possible to talk 
about lines and triangles without using axioms. But the advantages of 
axiomatizing geometry were seen very early in the history of the subject. 

The nonaxiomatic approach to set theory is often referred to as “naive 
set theory,” a terminology that does not hide its bias. Historically, set theory 
originated in nonaxiomatic form. But the paradoxes of naive set theory (the 
most famous being Russell’s paradox) forced the development of axiomatic 
set theory, by showing that certain assumptions, apparently plausible, were 
inconsistent and hence totally untenable. It then became mandatory to give 
explicit assumptions that could be examined by skeptics for possible 
inconsistency. Even without the foundational crises posed by the paradoxes 
of naive set theory, the axiomatic approach would have been developed to 
cope with later controversy over the truth or falsity of certain principles, 
such as the axiom of choice (Chapter 6). Of course our selection of 
axioms will be guided by the desire to reflect as accurately as possible our 
informal (preaxiomatic) ideas regarding sets and classes. 

It is nonetheless quite possible to study set theory from the nonaxiomatic 
viewpoint. We have therefore arranged the material in the book in a “two- 
tier” fashion. The passages dealing with axioms will henceforth be marked 
by a stripe in the left margin. The reader who omits such passages and 
reads only the unstriped material will thereby find a nonaxiomatic 
development of set theory. Perhaps he will at some later time wish to look at 
the axiomatic underpinnings. On the other hand, the reader who omits 
nothing will find an axiomatic development. Notice that most of the 
striped passages appear early in the book, primarily in the first three 
chapters. Later in the book the nonaxiomatic and the axiomatic approaches 
largely converge. 


Our axiom system begins with two primitive notions, the concepts of 
“set” and “member.” In terms of these concepts we will define others, 
but the primitive notions remain undefined. Instead we adopt a list of 
axioms concerning the primitive notions. (The axioms can be thought of 
as divulging partial information regarding the meaning of the primitive 
notions.) 
Having adopted a list of axioms, we will then proceed to derive
sentences that are logical consequences (or theorems) of the axioms. Here
a sentence σ is said to be a logical consequence of the axioms if any
assignment of meaning to the undefined notions of set and member
making the axioms true also makes σ true.

We have earlier sketched, in an informal and sometimes vague way, 
what “set” and “member” are intended to mean. But for a sentence to 
be a logical consequence of the axioms, it must be true whatever “set” 
and “member” mean, provided only that the axioms are true. The 
sentences that appear, on the basis of our informal viewpoint, as if they 
ought to be true, must still be shown to be logical consequences of the 
axioms before being accepted as theorems. In return for adopting this 
restriction, we escape any drawbacks of the informality and vagueness 
of the nonaxiomatic viewpoint. 

(There is an interesting point here concerning the foundations of 
mathematics. If σ is a logical consequence of a list of axioms, is there
then a finitely long proof of σ from the axioms? The answer is affirmative,
under a very reasonable definition of “proof.” This is an important result 
in mathematical logic. The topic is treated, among other places, in our 
book A Mathematical Introduction to Logic, Academic Press, 1972.) 

For example, the first of our axioms is the axiom of extensionality, 
which is almost as follows: Whenever A and B are sets such that exactly 
the same things are members of one as are members of the other, then 
A = B. Imagine for the moment that this were our only axiom. We
can then consider some of the logical consequences of this one axiom. 

For a start, take the sentence: “There cannot be two different sets, 
each of which has no members.” This sentence is a logical consequence 
of extensionality, for we claim that any assignment of meaning to “set” 
and “member” making extensionality true also makes the above sentence 
true. To prove this, we argue as follows. Let A and B be any sets, each 
of which has no members. Then exactly the same things belong to A 
as to B, since none belong to either. Hence by extensionality, A = B. 
(The validity of this argument, while independent of the meaning of 
“set” or “member,” does depend on the meaning of the logical terms 
such as “each,” “no,” “equal,” etc.) 

On the other hand, consider a simple sentence such as σ: “There are
two sets, one of which is a member of the other.” This sentence is not
a logical consequence of extensionality. To see this, let the word “set”
mean “a number equal to 2,” and let “member of” mean “unequal to.”
Under this interpretation, extensionality is true but σ is false. Of course
we will soon add other axioms of which σ will be a logical consequence.
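
The idea of reinterpreting “set” and “member” can be made concrete for finite interpretations. In the sketch below, an interpretation is taken to be a finite list of objects (the “sets”) together with a two-place Python function (the “member of” relation). This encoding, the function names, and the second and third interpretations are assumptions introduced here for illustration; only the first interpretation comes from the text.

```python
from itertools import product

def extensionality_holds(domain, member):
    """Whenever a and b have exactly the same 'members', a equals b."""
    return all(
        a == b
        for a, b in product(domain, repeat=2)
        if all(member(x, a) == member(x, b) for x in domain)
    )

def sigma_holds(domain, member):
    """There are two 'sets', one of which is a 'member' of the other."""
    return any(a != b and member(a, b) for a, b in product(domain, repeat=2))

# The text's interpretation: "set" means "a number equal to 2",
# and "member of" means "unequal to".
print(extensionality_holds([2], lambda x, y: x != y))   # True
print(sigma_holds([2], lambda x, y: x != y))            # False

# A hypothetical interpretation in which both sentences are true.
print(extensionality_holds([0, 1], lambda x, y: x < y)) # True
print(sigma_holds([0, 1], lambda x, y: x < y))          # True

# A hypothetical interpretation in which extensionality itself fails:
# two different objects, neither having any "members".
print(extensionality_holds([0, 1], lambda x, y: False)) # False
```

Because the first interpretation makes extensionality true and σ false, σ cannot be a logical consequence of extensionality alone.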
NOTATION


To denote sets we will use a variety of letters, both lowercase (a, b, ...), 
uppercase (A, B, ...), and even script letters and Greek letters. Where 
feasible, we will attempt to have the larger and fancier letters denote sets 
higher in our hierarchy of levels than those denoted by smaller and plainer 
letters. In addition, letters can be embellished with subscripts, primes, and 
the like. This assures us of an inexhaustible supply of symbols for sets. In 
other words, when it comes to naming sets, anything goes. 

It will often be advantageous to exploit the symbolic notation of 
mathematical logic. This symbolic language, when used in judicious amounts 
to replace the English language, has the advantages of both conciseness 
(so that expressions are shorter) and preciseness (so that expressions are 
less ambiguous). 

The following symbolic expressions will be used to abbreviate the 
corresponding English expressions: 


∀x    for every set x
∃x    there exists a set x such that
¬     not
&     and
or    or (in the sense “one or the other or both”)
⇒     implies (“___ ⇒ ___” abbreviates “if ___, then ___”)
⇔     if and only if, also abbreviated “iff”


You have probably seen these abbreviations before; they are discussed in 
more detail in the Appendix. 

We also have available the symbols ∈ and = (and ∉ and ≠, although
we could economize by eliminating, e.g., “a ∉ B” in favor of “¬ a ∈ B”).
With all these symbols, variables (a, b, ...), and parentheses we could avoid 
the English language altogether in the statement of axioms and theorems. 
We will not actually do so, or at least not all at once. But the fact that 
we have this splendid formal language available is more than a theoretical 
curiosity. When we come to stating certain axioms, the formal language will 
be a necessity. 

Notice that we read ∀x as “for all sets x,” rather than “for all things x.”
This is due to our decision to eliminate atoms from our theory. A result 
of this elimination is that everything we consider is a set; e.g., every member 
of a set will itself be a set. 


Example The principle of extensionality can be written as 


∀A ∀B[(A and B have exactly the same members) ⇒ A = B].


Then “A and B have exactly the same members” can be written as


∀x(x ∈ A ⇔ x ∈ B)


so that extensionality becomes


∀A ∀B[∀x(x ∈ A ⇔ x ∈ B) ⇒ A = B].


Example “There is a set to which nothing belongs” can be written


∃B ∀x x ∉ B.


These two examples constitute our first two axioms. It is not really 
necessary for us to state the axioms in symbolic form. But we will seize 
the opportunity to show how it can be done, on the grounds that the 
more such examples you see, the more natural the notation will become 
to you. 


HISTORICAL NOTES 


The concept of a set is very basic and natural, and has been used in 
mathematical writings since ancient times. But the theory of abstract sets, 
as objects to be studied for their own interest, was originated largely by 
Georg Cantor (1845-1918). Cantor was a German mathematician, and his 
papers on set theory appeared primarily during the period from 1874 to 
1897. 

Cantor was led to the study of set theory in a very indirect way. He 
was studying trigonometric series (Fourier series), e.g., series of the form 


a_1 sin x + a_2 sin 2x + a_3 sin 3x + ···.


Such series had been studied throughout the nineteenth century as solutions 
to differential equations representing physical problems. Cantor’s work on 
Fourier series led him to consider more and more general sets of real 
numbers. In 1871 he realized that a certain operation on sets of real numbers 
(the operation of forming the set of limit points) could be iterated more 
than a finite number of times; starting with a set P_0, one could form
P_0, P_1, P_2, ..., P_ω, P_{ω+1}, ..., P_{ω+ω}, .... In December of 1873 Cantor
proved that the set of all real numbers could not be put into one-to-one 
correspondence with the integers (Theorem 6B); this result was published 
in 1874. In 1879 and subsequent years, he published a series of papers 
setting forth the general concepts of abstract sets and “transfinite numbers.” 

Cantor’s work was well received by some of the prominent mathematicians
of his day, such as Richard Dedekind. But his willingness to
regard infinite sets as objects to be treated in much the same way as finite 
sets was bitterly attacked by others, particularly Kronecker. There was no
objection to a “potential infinity” in the form of an unending process, but
an “actual infinity” in the form of a completed infinite set was harder 
to accept. 

About the turn of the century, attempts were made to present the 
principles of set theory as being principles of logic—as self-evident truths 
of deductive thought. The foremost work in this direction was done by 
Gottlob Frege. Frege was a German mathematician by training, who 
contributed to both mathematics and philosophy. In 1893 and 1903 he 
published a two-volume work in which he indicated how mathematics 
could be developed from principles that he regarded as being principles 
of logic. But just as the second volume was about to appear, Bertrand 
Russell informed Frege of a contradiction derivable from the principles 
(Russell’s paradox). 

Russell’s paradox had a tremendous impact on the ideas of that time 
regarding the foundations of mathematics. It was not, to be sure, the 
first paradox to be noted in set theory. Cantor himself had observed 
that some collections, such as the collection of all sets, had to be regarded 
as “inconsistent totalities,” in contrast to the more tractable “consistent 
totalities,” such as the set of numbers. In this he foreshadowed the 
distinction between proper classes and sets, which was introduced by John 
von Neumann in 1925. Also in 1897, Cesare Burali-Forti had observed a 
paradoxical predicament in Cantor’s theory of transfinite ordinal numbers. 
But the simplicity and the directness of Russell’s paradox seemed to destroy 
utterly the attempt to base mathematics on the sort of set theory that 
Frege had proposed. 

The first axiomatization of set theory was published by Ernst Zermelo 
in 1908. His axioms were essentially the ones we state in Chapters 2 and 4. 
(His Aussonderung axioms were later made more precise by Thoralf Skolem 
and others.) It was observed by several people that for a satisfactory theory 
of ordinal numbers, Zermelo’s axioms required strengthening. The axiom of 
replacement (Chapter 7) was proposed by Abraham Fraenkel (in 1922) and 
others, giving rise to the list of axioms now known as the “Zermelo-Fraenkel”
(ZF) axioms. The axiom of regularity or foundation (Chapter 7)
was at least implicit in a 1917 paper by Dmitry Mirimanoff and was 
explicitly included by von Neumann in 1925. 

An axiomatization admitting proper classes as legitimate objects was 
formulated by von Neumann in the 1925 paper just mentioned. Some of 
his ideas were utilized by Paul Bernays in the development of a more 
satisfactory axiomatization, which appeared in a series of papers published 
in 1937 and later years. A modification of Bernays’s axioms was used by 
Kurt Gödel in a 1940 monograph. This approach is now known as “von 
Neumann-Bernays” (VNB) or “Gödel-Bernays” (GB) set theory.

The use of the symbol ∈ (a stylized form of the Greek epsilon) to denote
membership was initiated by the Italian mathematician Giuseppe Peano
in 1889. It abbreviates the Greek word ἐστί, which means “is.” The underlying
rationale is illustrated by the fact that if B is the set of all blue objects,
then we write “x ∈ B” in order to assert that x is blue.

Present-day research in set theory falls for the most part into two 
branches. One branch involves investigating the consequences of new and 
stronger axioms. The other branch involves the “metamathematics” of set 
theory, which is the study not of sets but of the workings of set theory 
itself: its proofs, its theorems, and its nontheorems. 


CHAPTER 2 


AXIOMS AND OPERATIONS 


In this chapter we begin by introducing the first six of our ten axioms. 
Initially the axiomatization might appear to be like cumbersome 
machinery to accomplish simple tasks. But we trust that it will eventually 
prove itself to be powerful machinery for difficult tasks. Of course the 
axioms are not chosen at random, but must ultimately reflect our informal 
ideas about what sets are. 


In addition to the introduction of basic concepts, this chapter provides 
practice in using the symbolic notation introduced in Chapter 1. Finally 
the chapter turns to the standard results on the algebra of sets. 


AXIOMS 
The first of our axioms is the principle of extensionality. 


Extensionality Axiom If two sets have exactly the same members, 
then they are equal: 


∀A ∀B[∀x(x ∈ A ⇔ x ∈ B) ⇒ A = B].

The above symbolic rendering of extensionality is the one we developed
previously. We will supply similar symbolizations for the other axioms. 
The more practice the better! 

Next we need some axioms assuring the existence of some basic sets 
that were encountered informally in the preceding chapter. 


Empty Set Axiom There is a set having no members: 
∃B ∀x x ∉ B.


Pairing Axiom For any sets u and v, there is a set having as members
just u and v:


∀u ∀v ∃B ∀x(x ∈ B ⇔ x = u or x = v).


Union Axiom, Preliminary Form For any sets a and b, there is a set 
whose members are those sets belonging either to a or to b (or both): 


∀a ∀b ∃B ∀x(x ∈ B ⇔ x ∈ a or x ∈ b).


Power Set Axiom For any set a, there is a set whose members are 
exactly the subsets of a: 


∀a ∃B ∀x(x ∈ B ⇔ x ⊆ a).


Here we can, if we wish, rewrite “x ⊆ a” in terms of the definition
of ⊆:
∀t(t ∈ x ⇒ t ∈ a).


Later we will expand this list to include 


subset axioms, replacement axioms, 
infinity axiom, regularity axiom, 
choice axiom. 


Also the union axiom will be restated in a stronger form. (Not all of these 
axioms are really necessary; some will be found to be redundant.) 

The set existence axioms can now be used to justify the definition of 
symbols used informally in Chapter 1. First of all, we want to define the 
symbol “Ø.”


Definition Ø is the set having no members. 


This definition bestows the name “Ø” on a certain set. But when we
write down such a definition there are two things of which we must be 
sure: We must know that there exists a set having no members, and 
we must know that there cannot be more than one set having no members. 
The empty set axiom provides the first fact, and extensionality provides 
the second. Without both facts the symbol “Ø” would not be well defined.
Severe logical difficulties arise from introducing symbols when either there
is no object for the symbol to name, or (even worse) the symbol names 
ambiguously more than one object. 

The other set existence axioms justify the definition of the following 
symbols. 


Definition (i) For any sets u and v, the pair set {u, v} is the set whose 
only members are u and v. 

(ii) For any sets a and b, the union a ∪ b is the set whose members
are those sets belonging either to a or to b (or both).

(iii) For any set a, the power set 𝒫a is the set whose members are
exactly the subsets of a.


As with the empty set, our set existence axioms assure us that the 
sets being named exist, and extensionality assures us that the sets being 
named are unique. 


We can use pairing and union together to form other finite sets. First of 
all, given any x we have the singleton {x}, which is defined to be {x, x}. 
And given any x₁, x₂, and x₃ we can define


{x₁, x₂, x₃} = {x₁, x₂} ∪ {x₃}.
Similarly we can define {x₁, x₂, x₃, x₄} and so forth.
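
The construction of finite sets from pairing and union can be tried out concretely. The following is only a sketch: it models sets by Python's immutable frozenset, and the helper names pair, singleton, and triple are illustrative choices, not notation from the text.

```python
def pair(u, v):
    """The pair set {u, v}; frozenset stands in for 'set' (an assumption)."""
    return frozenset({u, v})

def singleton(x):
    """The singleton {x}, defined as the pair set {x, x}."""
    return pair(x, x)

def triple(x1, x2, x3):
    """{x1, x2, x3} = {x1, x2} ∪ {x3}, using pairing and then union."""
    return pair(x1, x2) | singleton(x3)

EMPTY = frozenset()          # Ø
ONE = singleton(EMPTY)       # {Ø}
TWO = pair(EMPTY, ONE)       # {Ø, {Ø}}

# Extensionality in action: order and repetition do not matter.
print(pair(EMPTY, ONE) == pair(ONE, EMPTY))
print(len(triple(EMPTY, ONE, ONE)))
```

Because a set is determined by its members, {x, x} has just one member, which is why the singleton can be defined through pairing alone.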


Having defined the union operation, we should accompany it by the 
intersection operation. But to justify the definition of intersection we need 
new axioms, to which we now turn. In the next few paragraphs we shall 
use our informal view of sets to motivate the formulation of these axioms. 

Observe that our set existence axioms contain expressions like “there 
is a set B whose members are those sets x satisfying the condition _,” 
where the blank is filled by some condition specifying which sets we want. 
In symbols this becomes 


∃B ∀x(x ∈ B ⇔ _).


If the axiom mentions some other sets t₁, ..., tₖ, then the full version
becomes


∀t₁ ··· ∀tₖ ∃B ∀x(x ∈ B ⇔ _)


with the blank filled by some expression involving t₁, ..., tₖ, and x. The
empty set axiom is not quite in this form, but it can be rewritten as


∃B ∀x(x ∈ B ⇔ x ≠ x),


which is in the above form (with k = 0). The set B whose existence is
asserted by such an axiom is (by extensionality) uniquely determined
by t₁, ..., tₖ, so we can give it a name (in which the symbols
t₁, ..., tₖ appear). This is just what we have done.

Now let us try to be more general and consider any sentence σ of
the form


∀t₁ ··· ∀tₖ ∃B ∀x(x ∈ B ⇔ _),


where the blank is filled by some expression involving at most t₁, ..., tₖ,
and x. If this sentence is true, then the set B could be named by use
of the abstraction notation of Chapter 1:
B = {x | _}.
The sets recently defined can be named by use of the abstraction
notation:
Ø = {x | x ≠ x},
{u, v} = {x | x = u or x = v},
a ∪ b = {x | x ∈ a or x ∈ b},
𝒫a = {x | x ⊆ a}.
One might be tempted to think that any sentence σ of the form
∀t₁ ··· ∀tₖ ∃B ∀x(x ∈ B ⇔ _)


should be adopted as true. But this is wrong; some sentences of this form 
are false in our informal view of sets (Chapter 1). For example, 


∃B ∀x(x ∈ B ⇔ x = x)


is false, since it asserts the existence of a set B to which every set belongs. 
The most that can be said is that there is a class A (but not necessarily a 
set) whose members are those sets x such that _:


A= {x | _}. 


In order for the class A to be a set it must be included in some level
V_α of the hierarchy. In fact it is enough for A to be included in any set c,
for then


A ⊆ c ⊆ V_α
for some α, and from this it follows that A ∈ V_{α+1}.


All this is to motivate the adoption of the subset axioms. These 
axioms say, very roughly, that any class A included in some set c must 
in fact be a set. But the axioms can refer (in the Zermelo-Fraenkel 
alternative) only to sets. So instead of direct reference to the class A, we 
refer instead to the expression _ that defined A.

Subset Axioms For each formula _ not containing B, the following
is an axiom:


∀t₁ ··· ∀tₖ ∀c ∃B ∀x(x ∈ B ⇔ x ∈ c & _).


In English, the axiom asserts (for any t₁, ..., tₖ and c) the existence
of a set B whose members are exactly those sets x in c such that _.
It then follows automatically that B is a subset of c (whence the name
“subset axiom”). The set B is uniquely determined (by t₁, ..., tₖ and c),
and can be named by use of a variation on the abstraction notation:
B = {x ∈ c | _}.


Example One of the subset axioms is: 
∀a ∀c ∃B ∀x(x ∈ B ⇔ x ∈ c & x ∈ a).


This axiom asserts the existence of the set we define to be the intersection
c ∩ a of c and a.


We are not tied to one particular choice of letters. For example, we 
will also allow as a subset axiom: 


∀A ∀B ∃S ∀t[t ∈ S ⇔ t ∈ A & t ∉ B].
This set S is the relative complement of B in A, denoted A — B. 


Note on Terminology The subset axioms are often known by the 
name Zermelo gave them, Aussonderung axioms. The word Aussonderung 
is German, and is formed from sondern (to separate) and aus (out).


Example In Chapter 4 we shall construct the set ω of natural numbers:
ω = {0, 1, 2, ...}.


We will then be able to use the subset axioms to form the set of even
numbers and the set of primes:


{x ∈ ω | x is even} and {y ∈ ω | y is prime}.


(But to do this we must be able to express “x is even” by means of a 
legal formula; we will return to this point shortly.) 


Example Let s be some set. Then there is a set Q whose members are 
the one-element subsets of s: 


Q = {a ∈ 𝒫s | a is a one-element subset of s}.
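
The three sets in the preceding examples all have the shape {x ∈ c | _}, and that shape has a direct finite analogue in code. The following is a sketch only: frozenset stands in for sets, a finite initial segment omega_20 stands in for ω, and the names separate, power_set, and is_prime are hypothetical helpers introduced for illustration.

```python
from itertools import combinations

def power_set(s):
    """All subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )

def separate(c, condition):
    """{x ∈ c | condition(x)}: by construction a subset of the given set c."""
    return frozenset(x for x in c if condition(x))

def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

omega_20 = frozenset(range(20))   # finite stand-in for ω (assumption)

evens = separate(omega_20, lambda x: x % 2 == 0)
primes = separate(omega_20, is_prime)

s = frozenset({"p", "q", "r"})
Q = separate(power_set(s), lambda a: len(a) == 1)   # one-element subsets of s

print(sorted(primes))
print(len(Q))
```

Note that separate never produces anything outside the set c it is given, which is the point of relativizing the condition to an existing set.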


We can now use the argument of Russell’s paradox to show that the 
class V of all sets is not itself a set.

Theorem 2A There is no set to which every set belongs.
Proof Let A be a set; we will construct a set not belonging to A. Let
B = {x ∈ A | x ∉ x}.
We claim that B ∉ A. We have, by the construction of B,
B ∈ B ⇔ B ∈ A & B ∉ B.
If B ∈ A, then this reduces to
B ∈ B ⇔ B ∉ B,


which is impossible, since one side must be true and the other false. Hence
B ∉ A. ⊣


One might ask whether a set can ever be a member of itself. We will 
argue much later (in Chapter 7) that it cannot. And consequently in the 
preceding proof, the set B is actually the same as the set A. 
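
The construction in the proof of Theorem 2A can be carried out on small examples. This is a sketch under a stated assumption: sets are modeled by nested frozensets (hereditarily finite sets only), and russell_subset is a hypothetical name for the set B of the proof.

```python
def russell_subset(A):
    """B = {x ∈ A | x ∉ x}, as in the proof of Theorem 2A."""
    return frozenset(x for x in A if x not in x)

EMPTY = frozenset()                    # Ø
ONE = frozenset({EMPTY})               # {Ø}
TWO = frozenset({EMPTY, ONE})          # {Ø, {Ø}}
A = frozenset({EMPTY, ONE, TWO})

B = russell_subset(A)
print(B in A)    # the proof shows this cannot hold
print(B == A)    # no frozenset is a member of itself, so B is all of A
```

In this model no set belongs to itself, so B coincides with A, in line with the remark above about Chapter 7.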


At this point we need to say just what a formula is. After all, it 
would be most unfortunate to have as one of the subset axioms 


∃B ∀x(x ∈ B ⇔ x ∈ ω & x is an integer definable in one line of type).


We are saved from this disaster by our logical symbols. By insisting that 
the formula be expressible in the formal language these symbols give us, 
we can eliminate “x is an integer definable in one line of type” from the 
list of possible formulas. (Moral: Those symbols are your friends!) 

The simplest formulas are expressions such as 


a ∈ B and a = b


(and similarly with other letters). More complicated formulas can then be 
built up from these by use of the expressions 


∀x, ∃x, ¬, &, or, ⇒, ⇔,


together with enough parentheses to avoid ambiguity. That is, from
formulas φ and ψ we can construct longer formulas ∀x φ, ∃x φ (and
similarly ∀y φ, etc.), (¬φ), (φ & ψ), (φ or ψ), (φ ⇒ ψ), and (φ ⇔ ψ). We
define a formula to be a string of symbols constructed from the simplest
formulas by use of the above-listed methods. For example,


∃x(x ∈ A & ∀t(t ∈ x ⇒ (¬ t ∈ A)))


is a formula. In practice, however, we are likely to abbreviate it by 
something a little more readable, such as 


(∃x ∈ A)(∀t ∈ x) t ∉ A.

An ungrammatical string of symbols such as )) ⇒ A is not a formula,
nor is 


x is an integer definable in one line of type. 
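
The inductive definition of a formula can be mirrored by a small data structure: atomic formulas at the leaves, and one constructor per building method. The sketch below is an illustration only. The class names, the function holds, and the decision to let quantifiers range over a small finite list called universe are all assumptions made for the example, and only some of the connectives are included.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class In:            # atomic formula  a ∈ b
    a: str
    b: str

@dataclass(frozen=True)
class Eq:            # atomic formula  a = b
    a: str
    b: str

@dataclass(frozen=True)
class Not:           # (¬φ)
    p: object

@dataclass(frozen=True)
class And:           # (φ & ψ)
    p: object
    q: object

@dataclass(frozen=True)
class Implies:       # (φ ⇒ ψ)
    p: object
    q: object

@dataclass(frozen=True)
class ForAll:        # ∀x φ
    var: str
    p: object

@dataclass(frozen=True)
class Exists:        # ∃x φ
    var: str
    p: object

def holds(f, env, universe):
    """Evaluate formula f; env maps variable names to sets in the universe."""
    if isinstance(f, In):
        return env[f.a] in env[f.b]
    if isinstance(f, Eq):
        return env[f.a] == env[f.b]
    if isinstance(f, Not):
        return not holds(f.p, env, universe)
    if isinstance(f, And):
        return holds(f.p, env, universe) and holds(f.q, env, universe)
    if isinstance(f, Implies):
        return (not holds(f.p, env, universe)) or holds(f.q, env, universe)
    if isinstance(f, ForAll):
        return all(holds(f.p, {**env, f.var: s}, universe) for s in universe)
    if isinstance(f, Exists):
        return any(holds(f.p, {**env, f.var: s}, universe) for s in universe)
    raise TypeError("not a formula")

# ∃x(x ∈ A & ∀t(t ∈ x ⇒ (¬ t ∈ A)))
formula = Exists("x", And(In("x", "A"),
                          ForAll("t", Implies(In("t", "x"), Not(In("t", "A"))))))

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
universe = [EMPTY, ONE, TWO]

print(holds(formula, {"A": TWO}, universe))
print(holds(formula, {"A": EMPTY}, universe))
```

An English phrase such as “x is an integer definable in one line of type” has no counterpart among these constructors, so it cannot be written as a value of this kind at all.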


Example Let s be some set. In a previous example we formed the set 
of one-element subsets of s: 


{a ∈ 𝒫s | a is a one-element subset of s}.


Now “a is a one-element subset of s” is not itself a formula, but it can
be rewritten as a formula. As a first step, it can be expressed as


a ⊆ s & a ≠ Ø & any two members of a coincide.
This in turn becomes the formula
((∀x(x ∈ a ⇒ x ∈ s) & ∃y y ∈ a) & ∀u ∀v((u ∈ a & v ∈ a) ⇒ u = v)).


In applications of subset axioms we generally will not write out the 
formula itself. And this example shows why; “a is a one-element subset 
of s” is much easier to read than the legal formula. But in every case 
it will be possible (for a person with unbounded patience) to eliminate 
the English words and the defined symbols (such as Ø, ∪, and so forth)
in order to arrive at a legal formula. The procedure for eliminating defined
symbols is discussed further in the Appendix.

The union operation previously described allows us to form the union
a ∪ b of two sets. By repeating the operation, we can form the union of
three sets or the union of forty sets. But suppose we want the union of 
infinitely many sets; suppose we have an infinite collection of sets 


A = {b₀, b₁, b₂, ...}


and we want to take the union of all the bᵢ's. For this we need a more
general union operation:


⋃A = ⋃ᵢ bᵢ
= {x | x belongs to some member bᵢ of A}.


This leads us to make the following definition. For any set A, the union
⋃A of A is the set defined by


⋃A = {x | x belongs to some member of A}
= {x | (∃b ∈ A) x ∈ b}.

Thus ⋃A is a melting pot into which all members of A are dumped.
For example, suppose Un is (as on p. 4) the set of countries belonging
to the United Nations. Then ⋃Un is the set of all people that are citizens
of some country belonging to the United Nations. A smaller example
(and one that avoids sets of people) is


⋃{{2, 4, 6}, {6, 16, 26}, {0}}.


You should evaluate this expression to make certain that you understand 
the union operation. If you do it correctly, you will end up with a set 
of six numbers. 
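
For readers who want to check their evaluation, the definition of ⋃A translates almost word for word into a short computation. This is a sketch that models sets as frozensets; the name big_union is an illustrative choice.

```python
def big_union(A):
    """⋃A = {x | x belongs to some member of A}."""
    return frozenset(x for b in A for x in b)

A = frozenset({
    frozenset({2, 4, 6}),
    frozenset({6, 16, 26}),
    frozenset({0}),
})

print(sorted(big_union(A)))   # [0, 2, 4, 6, 16, 26]
```

The number 6 belongs to two members of A but appears only once in the result, since a set has no repeated members.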


We need an improved version of the union axiom in order to know 
that a set exists containing the members of the members of A. 


Union Axiom For any set A, there exists a set B whose elements are 
exactly the members of the members of A: 
∀x[x ∈ B ⇔ (∃b ∈ A) x ∈ b].
We can state the definition of ⋃A in the following form:
x ∈ ⋃A ⇔ (∃b ∈ A) x ∈ b.
For example,
⋃{a, b} = {x | x belongs to some member of {a, b}}
= {x | x belongs to a or to b}
= a ∪ b.
This example shows that our preliminary form of the union axiom
can be discarded in favor of the new form. That is, the set a ∪ b produced


by the preliminary form can also be obtained from pairing and the 
revised form of the union axiom. 


Similarly we have 
⋃{a, b, c, d} = a ∪ b ∪ c ∪ d and ⋃{a} = a.


An extreme case is ⋃Ø = Ø.

We also want a corresponding generalization of the intersection 
operation. Suppose we want to take the intersection of infinitely many 
sets b₀, b₁, .... Then where


A = {b₀, b₁, ...}
the desired intersection can be informally characterized as


⋂A = ⋂ᵢ bᵢ
= {x | x belongs to every bᵢ in A}.

In general, we define for every nonempty set A, the intersection ⋂A of A
by the condition


x ∈ ⋂A ⇔ x belongs to every member of A.


In contrast to the union operation, no special axiom is needed to 
justify the intersection operation. Instead we have the following theorem. 


Theorem 2B For any nonempty set A, there exists a unique set B 
such that for any x, 


x ∈ B ⇔ x belongs to every member of A.
This theorem permits defining ⋂A to be that unique set B.


Proof We are given that A is nonempty; let c be some fixed member
of A. Then by a subset axiom there is a set B such that for any x,
x ∈ B ⇔ x ∈ c & x belongs to every other member of A
⇔ x belongs to every member of A.


Uniqueness, as always, follows from extensionality. ⊣


Examples 


⋂{{1, 2, 8}, {2, 8}, {4, 8}} = {8},
⋃{{1, 2, 8}, {2, 8}, {4, 8}} = {1, 2, 4, 8}.


Examples


⋂{a} = a,
⋂{a, b} = a ∩ b,
⋂{a, b, c} = a ∩ b ∩ c.


In these last examples, as A becomes larger, ⋂A gets smaller. More
precisely: Whenever A ⊆ B, then ⋂B ⊆ ⋂A. There is one troublesome
extreme case. What happens if A = Ø? For any x at all, it is vacuously
true that x belongs to every member of Ø. (There can be no member of Ø
to which x fails to belong.) Thus it looks as if ⋂Ø should be the class V
of all sets. By Theorem 2A, there is no set C such that for all x,


x ∈ C ⇔ x belongs to every member of Ø


since the right side is true of every x. This presents a mild notational
problem: How do we define ⋂Ø? The situation is analogous to division
by zero in arithmetic. How does one define a ÷ 0? One option is to leave
⋂Ø undefined, since there is no very satisfactory way of defining it. This
option works perfectly well, but some logicians dislike it. It leaves ⋂Ø as
an untidy loose end, which they may later trip over. The other option is
to select some arbitrary scapegoat (the set Ø is always used for this) and
define ⋂Ø to equal that object. Either way, whenever one forms ⋂A one
must beware the possibility that perhaps A = Ø. Since it makes no difference
which of the two options one follows, we will not bother to make a choice
between them at all.
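
The proof of Theorem 2B suggests a way to compute ⋂A for finite examples: fix one member c of A and keep those members of c that belong to every other member. The sketch below follows that recipe. It models sets as frozensets, uses the illustrative name big_intersection, and (purely for the sake of the example) takes the first option above by refusing to evaluate ⋂Ø.

```python
def big_intersection(A):
    """⋂A for nonempty A, built as {x ∈ c | x belongs to every other member of A}."""
    members = list(A)
    if not members:
        # Assumption for this sketch: ⋂Ø is left undefined.
        raise ValueError("the intersection of the empty set is left undefined")
    c, *others = members
    return frozenset(x for x in c if all(x in b for b in others))

A = frozenset({frozenset({1, 2, 8}), frozenset({2, 8}), frozenset({4, 8})})
print(sorted(big_intersection(A)))                    # [8]

a = frozenset({1, 2, 3})
b = frozenset({2, 3, 4})
print(big_intersection(frozenset({a})) == a)          # ⋂{a} = a
print(big_intersection(frozenset({a, b})) == a & b)   # ⋂{a, b} = a ∩ b
```

The result does not depend on which member plays the role of c, which matches the uniqueness part of the theorem.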


Example If b ∈ A, then b ⊆ ⋃A.


Example If {{x}, {x, y}} ∈ A, then {x, y} ∈ ⋃A, x ∈ ⋃⋃A, and y ∈ ⋃⋃A.
Example ⋂{{a}, {a, b}} = {a} ∩ {a, b} = {a}. Hence
⋃⋂{{a}, {a, b}} = ⋃{a} = a.


On the other hand,
⋂⋃{{a}, {a, b}} = ⋂{a, b} = a ∩ b.
Exercises 
See also the Review Exercises at the end of this chapter. 


1. Assume that A is the set of integers divisible by 4. Similarly assume 
that B and C are the sets of integers divisible by 9 and 10, respectively. 
What is in A ∩ B ∩ C?


2. Give an example of sets A and B for which ⋃A = ⋃B but A ≠ B.


3. Show that every member of a set A is a subset of ⋃A. (This was
stated as an example in this section.)


4. Show that if A ⊆ B, then ⋃A ⊆ ⋃B.
5. Assume that every member of 𝒜 is a subset of B. Show that ⋃𝒜 ⊆ B.


6. (a) Show that for any set A, ⋃𝒫A = A.
(b) Show that A ⊆ 𝒫⋃A. Under what conditions does equality hold?


7. (a) Show that for any sets A and B,
𝒫A ∩ 𝒫B = 𝒫(A ∩ B).
(b) Show that 𝒫A ∪ 𝒫B ⊆ 𝒫(A ∪ B). Under what conditions does
equality hold?

8. Show that there is no set to which every singleton (that is, every set
of the form {x}) belongs. [Suggestion: Show that from such a set, we could
construct a set to which every set belonged.]

9. Give an example of sets a and B for which a ∈ B but 𝒫a ∉ 𝒫B.


10. Show that if a ∈ B, then 𝒫a ∈ 𝒫𝒫⋃B. [Suggestion: If you need help,
look in the Appendix.]

ALGEBRA OF SETS


Two basic operations on sets are the operations of union and intersection: 


A ∪ B = {x | x ∈ A or x ∈ B},
A ∩ B = {x | x ∈ A & x ∈ B}.


Also we have for any sets A and B the relative complement A - B of B in A:
A - B = {x ∈ A | x ∉ B}.


The usual diagram for A - B is shown in Fig. 4. In some books the
minus sign is needed for other uses, and the relative complement is then
denoted A \ B.


Fig. 4. The shaded area represents A - B.


The union axiom was used to give us A ∪ B. But A ∩ B and A - B
were both obtained from subset axioms. 


We cannot form (as a set) the “absolute complement” of B, i.e., {x | x ∉ B}.
This class fails to be a set, for its union with B would be the class of all
sets. In any event, the absolute complement is unlikely to be an interesting
object of study.

For example, suppose one is studying sets of real numbers. Let R be the
set of all real numbers, and suppose that B ⊆ R. Then the relative complement
R - B consists of those real numbers not in B. On the other hand,
the absolute complement of B would be a huge class containing all manner 
of irrelevant things; it would contain any set that was not a real number. 


Example Let A be the set of all left-handed people, let B be the set of 
all blond people, and let C be the set of all male people. (We choose to 
suppress in this example, as in others, the fact that we officially banned 
people from our sets.) Then A ∪ (B - C) is the set of all people who
either are left-handed or are blond nonmales (or both). On the other hand
(A ∪ B) - C is the set of all nonmales who are either left-handed or blond.
These two sets are different; Joe (who is a left-handed male) belongs to the
first set but not the second. The set
(A - C) ∪ (B - C)
is the same as one of the two sets mentioned above. Which one? 


The study of the operations of union (∪), intersection (∩), and relative
complementation (-), together with the inclusion relation (⊆), goes by the
name of the algebra of sets. In some ways, the algebra of sets obeys laws
reminiscent of the algebra of real numbers (with +, ·, -, and ≤), but
there are significant differences. 

The following identities, which hold for any sets, are some of the 
elementary facts of the algebra of sets. 


Commutative laws
A ∪ B = B ∪ A and A ∩ B = B ∩ A.
Associative laws
A ∪ (B ∪ C) = (A ∪ B) ∪ C,
A ∩ (B ∩ C) = (A ∩ B) ∩ C.
Distributive laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
De Morgan’s laws
C - (A ∪ B) = (C - A) ∩ (C - B),
C - (A ∩ B) = (C - A) ∪ (C - B).
Identities involving Ø
A ∪ Ø = A and A ∩ Ø = Ø,
A ∩ (C - A) = Ø.


Often one considers sets, all of which are subsets of some large set or 
“space” S. A common example is the study of subsets of the space R of 
real numbers. Assume then that A and B are subsets of S. Then we can 
abbreviate S - A as simply -A, the set S being understood as fixed. In this
abbreviation, De Morgan’s laws become


-(A ∪ B) = -A ∩ -B,
-(A ∩ B) = -A ∪ -B.

Further, we have (still under the assumption that A ⊆ S)
A ∪ S = S and A ∩ S = A,
A ∪ -A = S and A ∩ -A = Ø.


Now we should say something about how one proves all these facts. 
Let us take as a sample the distributive law: 


A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
One way to check this is to draw the picture (Fig. 5). After shading
the region representing A ∩ (B ∪ C) and the region representing (A ∩ B) ∪
(A ∩ C), one discovers that these regions are the same.


Fig. 5. Diagram for three sets.

Is the foregoing proof, which relies on a picture, really trustworthy? 
Let us run through it again without the picture. To prove the desired 
equation, it suffices (by extensionality) to consider an arbitrary x, and to 
show that x belongs to A ∩ (B ∪ C) iff it belongs to (A ∩ B) ∪ (A ∩ C).
So consider this arbitrary x. We do not know for sure whether x ∈ A or
not, whether x ∈ B or not, etc., but we can list all eight possibilities:

x ∈ A x ∈ B x ∈ C
x ∈ A x ∈ B x ∉ C
x ∈ A x ∉ B x ∈ C
x ∈ A x ∉ B x ∉ C
x ∉ A x ∈ B x ∈ C
x ∉ A x ∈ B x ∉ C
x ∉ A x ∉ B x ∈ C
x ∉ A x ∉ B x ∉ C.

(These cases correspond to the eight regions of Fig. 5.) We can then verify
that in each of the eight cases,


x ∈ A ∩ (B ∪ C) iff x ∈ (A ∩ B) ∪ (A ∩ C).
For example, in the fifth case we find that
x ∉ A ∩ (B ∪ C) and x ∉ (A ∩ B) ∪ (A ∩ C).


(What region of Fig. 5 does this case represent?) When verification for 
the seven other cases has been made, the proof of the equation is complete. 

This method of proof is applicable to all the equations listed thus far; 
in fact it works for any equation or inclusion of this sort. If the equation 
involves n letters, then there will be 2ⁿ cases. In the distributive law we
had three letters and eight cases. But less mechanical methods of proof 
will be needed for some of the facts listed below. 
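
The case-by-case method is mechanical enough to hand to a machine. In the sketch below, each side of an equation is written as a function of truth values (whether x ∈ A, whether x ∈ B, and so on), and all 2ⁿ combinations are compared. The function name same_in_all_cases is a hypothetical choice, and the encoding of set expressions as Boolean expressions is an assumption of the example.

```python
from itertools import product

def same_in_all_cases(lhs, rhs, n):
    """Compare two membership conditions in all 2**n cases for n letters."""
    return all(lhs(*case) == rhs(*case)
               for case in product([True, False], repeat=n))

# a, b, c stand for "x ∈ A", "x ∈ B", "x ∈ C".
# A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
distributive = same_in_all_cases(
    lambda a, b, c: a and (b or c),
    lambda a, b, c: (a and b) or (a and c),
    3,
)

# C - (A ∪ B) = (C - A) ∩ (C - B)
de_morgan = same_in_all_cases(
    lambda a, b, c: c and not (a or b),
    lambda a, b, c: (c and not a) and (c and not b),
    3,
)

# A ∪ (B - C) versus (A ∪ B) - C: these differ (the "Joe" case).
not_an_identity = same_in_all_cases(
    lambda a, b, c: a or (b and not c),
    lambda a, b, c: (a or b) and not c,
    3,
)

print(distributive, de_morgan, not_an_identity)   # True True False
```

The call product([True, False], repeat=3) generates the eight cases in the same order as the list in the text, starting with x in all three sets and ending with x in none.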

For the inclusion relation, we have the following monotonicity properties: 


A ⊆ B ⇒ A ∪ C ⊆ B ∪ C,
A ⊆ B ⇒ A ∩ C ⊆ B ∩ C,
A ⊆ B ⇒ ⋃A ⊆ ⋃B,
and the “antimonotone” results:
Ø ≠ A ⊆ B ⇒ ⋂B ⊆ ⋂A.
In each case, the proof is straightforward. For example, in the last case,
we assume that every member of A is also a member of B. Hence if
x ∈ ⋂B, i.e., if x belongs to every member of B, then a fortiori x belongs
to every member of the smaller collection A. And consequently x ∈ ⋂A.


Next we list some more identities involving arbitrary unions and 
intersections. 


Distributive laws 
A ∪ ⋂ℬ = ⋂{A ∪ X | X ∈ ℬ} for ℬ ≠ Ø,
A ∩ ⋃ℬ = ⋃{A ∩ X | X ∈ ℬ}.


The notation used on the right side is an extension of the abstraction
notation. The set {A ∪ X | X ∈ ℬ} (read “the set of all A ∪ X such that
X ∈ ℬ”) is the unique set 𝒟 whose members are exactly the sets of the
form A ∪ X for some X in ℬ; i.e.,


t ∈ 𝒟 ⇔ t = A ∪ X for some X in ℬ.

The existence of such a set 𝒟 can be proved by observing that
A ∪ X ⊆ A ∪ ⋃ℬ. Hence the set 𝒟 we seek is a subset of 𝒫(A ∪ ⋃ℬ).
A subset axiom produces


{t ∈ 𝒫(A ∪ ⋃ℬ) | t = A ∪ X for some X in ℬ},
and this is exactly the set 𝒟.


For another example of this notation, suppose that sets 𝒜 and C are
under consideration. Then


{C - X | X ∈ 𝒜}
is the set of relative complements of members of 𝒜, i.e., for any t,
t ∈ {C - X | X ∈ 𝒜} ⇔ t = C - X for some X in 𝒜.
Similarly, {𝒫X | X ∈ 𝒜} is the set for which
t ∈ {𝒫X | X ∈ 𝒜} ⇔ t = 𝒫X for some X in 𝒜.


It is not entirely obvious that any such set can be proved to exist, but 
see Exercise 10. 


De Morgan’s laws (for 𝒜 ≠ Ø)
C - ⋃𝒜 = ⋂{C - X | X ∈ 𝒜},
C - ⋂𝒜 = ⋃{C - X | X ∈ 𝒜}.
If ⋃𝒜 ⊆ S, then these laws can be written as
-⋃𝒜 = ⋂{-X | X ∈ 𝒜},
-⋂𝒜 = ⋃{-X | X ∈ 𝒜},


where it is understood that -X is S - X.
To prove, for example, that for nonempty 𝒜 the equation


C - ⋃𝒜 = ⋂{C - X | X ∈ 𝒜}
holds, we can argue as follows:


t ∈ C - ⋃𝒜 ⇒ t ∈ C but t belongs to no member of 𝒜
⇒ t ∈ C - X for every X in 𝒜
⇒ t ∈ ⋂{C - X | X ∈ 𝒜}.


Furthermore every step reverses, so that “⇒” can become “⇔.” (A question
for the alert reader: Where do we use the fact that 𝒜 ≠ Ø?)

A Final Remark on Notation There is another style of writing some of
the unions and intersections with which we have been working. We can
write, for example,


⋂_{X ∈ ℬ} (A ∪ X) for ⋂{A ∪ X | X ∈ ℬ}
and

⋃_{X ∈ 𝒜} (C - X) for ⋃{C - X | X ∈ 𝒜}.


But for the most part we will stick to our original notation. 
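
The distributive and De Morgan laws for arbitrary unions and intersections can be spot-checked on a small family of sets. A check on one example proves nothing in general, so the following is only a sanity-check sketch. It models sets as frozensets, and the names big_union, big_intersection, and family, along with the particular sample sets, are illustrative assumptions.

```python
def big_union(family):
    """⋃ of a family of sets."""
    return frozenset(x for member in family for x in member)

def big_intersection(family):
    """⋂ of a nonempty family of sets."""
    members = list(family)
    if not members:
        raise ValueError("family must be nonempty")
    first, *rest = members
    return frozenset(x for x in first if all(x in m for m in rest))

C = frozenset(range(10))
A = frozenset({0, 1})
family = frozenset({frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 8})})

# The set {C - X | X ∈ family}, written as a comprehension over the family.
complements = frozenset(C - X for X in family)

de_morgan_1 = (C - big_union(family)) == big_intersection(complements)
de_morgan_2 = (C - big_intersection(family)) == big_union(complements)
distributive_1 = (A | big_intersection(family)) == big_intersection(
    frozenset(A | X for X in family))
distributive_2 = (A & big_union(family)) == big_union(
    frozenset(A & X for X in family))

print(de_morgan_1, de_morgan_2, distributive_1, distributive_2)
```

The comprehension frozenset(C - X for X in family) plays the role of the extended abstraction notation {C - X | X ∈ 𝒜}.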


Exercises 
11. Show that for any sets A and B,
A = (A ∩ B) ∪ (A - B) and A ∪ (B - A) = A ∪ B.
12. Verify the following identity (one of De Morgan’s laws):
C - (A ∩ B) = (C - A) ∪ (C - B).
13. Show that if A ⊆ B, then C - B ⊆ C - A.


14. Show by example that for some sets A, B, and C, the set A - (B - C)
is different from (A - B) - C.
15. Define the symmetric difference A + B of sets A and B to be the set
(A - B) ∪ (B - A).
(a) Show that A ∩ (B + C) = (A ∩ B) + (A ∩ C).
(b) Show that A + (B + C) = (A + B) + C.


16. Simplify:
[(A ∪ B ∪ C) ∩ (A ∪ B)] - [(A ∪ (B - C)) ∩ A].
17. Show that the following four conditions are equivalent.
(a) A ⊆ B, (b) A - B = Ø,
(c) A ∪ B = B, (d) A ∩ B = A.
18. Assume that A and B are subsets of S. List all of the different sets
that can be made from these three by use of the binary operations ∪, ∩,
and -.


19. Is 𝒫(A - B) always equal to 𝒫A - 𝒫B? Is it ever equal to 𝒫A - 𝒫B?


20. Let A, B, and C be sets such that A ∪ B = A ∪ C and A ∩ B = A ∩ C.
Show that B = C.


21. Show that ⋃(A ∪ B) = ⋃A ∪ ⋃B.
22. Show that if A and B are nonempty sets, then ⋂(A ∪ B) = ⋂A ∩ ⋂B.
23. Show that if ℬ is nonempty, then A ∪ ⋂ℬ = ⋂{A ∪ X | X ∈ ℬ}.


24. (a) Show that if 𝒜 is nonempty, then 𝒫⋂𝒜 = ⋂{𝒫X | X ∈ 𝒜}.
(b) Show that


⋃{𝒫X | X ∈ 𝒜} ⊆ 𝒫⋃𝒜.
Under what conditions does equality hold?


25. Is A ∪ ⋃ℬ always the same as ⋃{A ∪ X | X ∈ ℬ}? If not, then under
what conditions does equality hold?


EPILOGUE 


This chapter was entitled “Axioms and Operations.” Some of the 
operations featured in the chapter are union, intersection, the power set
operation, and relative complementation. You should certainly know what all
of these operations are and be able to work with them. You should also be 
able to prove various properties of the operations. 


Also in this chapter we introduced six of our ten axioms. Or more 
accurately, we introduced five axioms and one axiom schema. (An axiom 
schema is an infinite bundle of axioms, such as the subset axioms.) 
From these axioms we can justify the definitions of the operations 
mentioned above. And the proofs of properties of the operations are, 
ultimately, proofs from our list of axioms. 


Review Exercises 

26. Consider the following sets: A = {3, 4}, B = {4, 3} U Ø, C = {4, 3} U 
{D} D={x|x?-7x+12=0}, E={Ø, 3, 4, F={4, 4, 3}, G= 
{4, @, @, 3}. For each pair of sets (e.g, A and E) specify whether or not 
the sets are equal. 


27. Give an example of sets A and B for which A ^ B is nonempty and 


QA (\BE(\(Ac B). 
28. Simplify: 


UHS, 4, {3}, (4), 3, 48, U3}, 43}. 
29. Simplify: 
(a) PPPD, PPD, PS, Sh. 
b) PPPD PAD}, AD}. 
30. Let A be the set {{Ø}, {{Ø}}}. Evaluate the following:
(a) PA, (b) ⋃A, (c) P⋃A, (d) ⋃PA.
31. Let B be the set {{1, 2}, {2, 3}, {1, 3}, {Ø}}. Evaluate the following sets.
(a) ⋃B, (b) ⋂B, (c) ⋂⋃B, (d) ⋃⋂B.

32. Let S be the set {{a}, {a, b}}. Evaluate and simplify:
(a) ⋃⋃S, (b) ⋂⋂S, (c) ⋂⋃S ∪ (⋃⋃S - ⋃⋂S).

33. With S as in the preceding exercise, evaluate ⋃(⋃S - ⋂S) when
a ≠ b and when a = b.

34. Show that {Ø, {Ø}} ∈ PPPS for every set S.

35. Assume that PA = PB. Prove that A = B.


36. Verify that for all sets the following are correct. 
(a) A - (A ∩ B) = A - B.
(b) A - (A - B) = A ∩ B.
37. Show that for all sets the following equations hold.
(a) (A ∪ B) - C = (A - C) ∪ (B - C).
(b) A - (B - C) = (A - B) ∪ (A ∩ C).
(c) (A - B) - C = A - (B ∪ C).
38. Prove that for all sets the following are valid.
(a) A ⊆ C & B ⊆ C ⇔ A ∪ B ⊆ C.
(b) C ⊆ A & C ⊆ B ⇔ C ⊆ A ∩ B.


CHAPTER 3 


RELATIONS AND FUNCTIONS 


In this chapter we introduce some concepts that are important throughout 
mathematics. The correct formulation (and understanding) of the definitions 
will be a major goal. The theorems initially will be those needed to 
justify the definitions, and those verifying some properties of the defined 
objects. 


ORDERED PAIRS 


The pair set {1, 2} can be thought of as an unordered pair, since {1, 2} = 
{2, 1}. We will need another object <1, 2> that will encode more information: 
that 1 is the first component and 2 is the second. In particular, we will 
demand that <1, 2> ≠ <2, 1>.

More generally, we want to define a set <x, y> that uniquely encodes
both what x and y are, and also what order they are in. In other words,
if an ordered pair can be represented in two ways


<x, y> = <u, v>,


then the representations are identical in the sense that x = u and y = v.
And in fact any way of defining <x, y> that satisfies this property of unique
decomposition will suffice. It will be instructive to consider first some
examples of definitions lacking this property.


Example 1 If we define <x, y>₁ = {x, y}, then (as noted above),
<1, 2>₁ = <2, 1>₁.

Example 2 Let <x, y>₂ = {x, {y}}. Again the desired property fails, since
<{Ø}, {Ø}>₂ = <{{Ø}}, Ø>₂, both sides being equal to {{Ø}, {{Ø}}}.


The first successful definition was given by Norbert Wiener in 1914, who 
proposed to let


<x, y>₃ = {{{x}, Ø}, {{y}}}.


A simpler definition was given by Kazimierz Kuratowski in 1921, and is the 
definition in general use today: 


Definition <x, y> is defined to be {{x}, {x, y}}. 


We must prove that this definition succeeds in capturing the desired 
property: The ordered pair <x, y> uniquely determines both what x and y 
are, and the order upon them. 

Theorem 3A <u, v> = <x, y> iff u = x and v = y.


Proof One direction is trivial; if u = x and v = y, then <u, v> is the
same thing as <x, y>.
To prove the interesting direction, assume that <u, v> = <x, y>, i.e.,


{{u}, {u, v}} = {{x}, {x, y}}.


Then we have


{u} ∈ {{x}, {x, y}} and {u, v} ∈ {{x}, {x, y}}.


From the first of these, we know that either
(a) {u} = {x} or (b) {u} = {x, y},
and from the second we know that either
(c) {u, v} = {x} or (d) {u, v} = {x, y}.


First suppose (b) holds; then u = x = y. Then (c) and (d) are equivalent, and
tell us that u = v = x = y. In this case the conclusion of the theorem holds.
Similarly if (c) holds, we have the same situation.

There remains the case in which (a) and (d) hold. From (a) we have
u = x. From (d) we get either u = y or v = y. In the first case (b) holds;
that case has already been considered. In the second case we have v = y,
as desired. ⊣

The preceding theorem lets us unambiguously define the first coordinate
of <x, y> to be x, and the second coordinate to be y.
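
The two failed definitions and the Kuratowski definition can be compared mechanically on small examples. The following Python sketch is only an illustration under stated assumptions: finite sets are modelled as frozensets, the helper names (naive_pair, lopsided_pair, kuratowski_pair) are our own, and the final loop checks the unique-decomposition property on a five-element universe rather than proving it.

```python
# Illustrative sketch: finite sets are modelled as Python frozensets.
# All helper names here are hypothetical, chosen for this example only.

EMPTY = frozenset()


def naive_pair(x, y):
    """Example 1: <x, y>_1 = {x, y}."""
    return frozenset({x, y})


def lopsided_pair(x, y):
    """Example 2: <x, y>_2 = {x, {y}}."""
    return frozenset({x, frozenset({y})})


def kuratowski_pair(x, y):
    """Kuratowski: <x, y> = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


# Example 1 loses the order of the components.
assert naive_pair(1, 2) == naive_pair(2, 1)

# Example 2 fails as in the text: <{Ø}, {Ø}>_2 = <{{Ø}}, Ø>_2.
one = frozenset({EMPTY})  # {Ø}
two = frozenset({one})    # {{Ø}}
assert lopsided_pair(one, one) == lopsided_pair(two, EMPTY)

# Theorem 3A, checked exhaustively on a small universe:
# equal Kuratowski pairs always have equal components.
universe = [EMPTY, one, two, 1, 2]
for u in universe:
    for v in universe:
        for x in universe:
            for y in universe:
                if kuratowski_pair(u, v) == kuratowski_pair(x, y):
                    assert (u, v) == (x, y)
```

Note that when x = y the Kuratowski pair collapses to {{x}}, which is exactly the degenerate case treated first in the proof.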


Example Let ℝ be the set of all real numbers. The pair <x, y> can be
visualized as a point in the plane (Fig. 6), where coordinate axes have been
established. This representation of points in the plane is attributed to
Descartes.


Now suppose that we have two sets A and B, and we form ordered
pairs <x, y> with x ∈ A and y ∈ B. The collection of all such pairs is called
the Cartesian product A x B of A and B:


A x B = {<x, y> | x ∈ A & y ∈ B}.


Fig. 6. The pair <x, y> as a point in the plane.


We must verify that this collection is actually a set before the 
definition is legal. When we use the abstraction notation {t | _t_} for
a set, we must verify that there does indeed exist a set D such that


t ∈ D ⇔ _t_
for every t. For example, just as strings of symbols, the expressions
{x | x = x} and {x | x ≠ x}


look similar. But by Theorem 2A the first does not name a set, whereas 
(by the empty set axiom) the second does. 

The strategy we follow in order to show that A x B is a set (and not 
a proper class) runs as follows. If we can find a large set that already 
contains all of the pairs <x, y> we want, then we can use a subset 
axiom to cut things down to A x B. A suitable large set to start with 
is provided by the next lemma. 


Lemma 3B If x ∈ C and y ∈ C, then <x, y> ∈ PPC.


Proof As the following calculation demonstrates, the fact that the
braces in “{{x}, {x, y}}” are nested to a depth of 2 is responsible for the
two applications of the power set operation:
x ∈ C and y ∈ C,
{x} ⊆ C and {x, y} ⊆ C,
{x} ∈ PC and {x, y} ∈ PC,
{{x}, {x, y}} ⊆ PC,
{{x}, {x, y}} ∈ PPC.


Corollary 3C For any sets A and B, there is a set whose members are
exactly the pairs <x, y> with x ∈ A and y ∈ B.


Proof From a subset axiom we can construct
{w ∈ PP(A ∪ B) | w = <x, y> for some x in A and some y in B}.


Clearly this set contains only pairs of the desired sort; by the preceding
lemma it contains them all. ⊣


This corollary justifies our earlier definition of the Cartesian product
A x B.
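
For small finite sets the strategy of Lemma 3B and Corollary 3C can be carried out literally. The sketch below is an illustration only: it assumes finite sets modelled as frozensets, and the names powerset, pair, and cartesian_product are hypothetical helpers. It builds the large set PP(A ∪ B) and then cuts it down to A x B, as a subset axiom would.

```python
from itertools import chain, combinations


def powerset(s):
    """Power set of a finite set, as a frozenset of frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def pair(x, y):
    """Kuratowski pair <x, y> = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def cartesian_product(a, b):
    """Select from PP(A ∪ B) those w with w = <x, y>, x in A, y in B."""
    big = powerset(powerset(a | b))            # the "large set" of Lemma 3B
    wanted = {pair(x, y) for x in a for y in b}
    return frozenset(w for w in big if w in wanted)


A = frozenset({1, 2})
B = frozenset({2, 3})
product = cartesian_product(A, B)
```

Here A ∪ B has 3 members, so PP(A ∪ B) has 2⁸ = 256 members, of which exactly 4 survive the selection. The approach is hopeless computationally for larger sets; it is meant only to mirror the existence proof.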


As you have probably observed, our decision to use the Kuratowski 
definition
<x, y> = {{x}, {x, y}}


is somewhat arbitrary. There are other definitions that would serve as well. 
The essential fact is that satisfactory ways exist of defining ordered pairs in 
terms of other concepts of set theory. 


Exercises 
See also the Review Exercises at the end of this chapter. 


1. Suppose that we attempted to generalize the Kuratowski definitions of
ordered pairs to ordered triples by defining


<x, y, z>* = {{x}, {x, y}, {x, y, z}}.


Show that this definition is unsuccessful by giving examples of objects
u, v, w, x, y, z with <x, y, z>* = <u, v, w>* but with either y ≠ v or
z ≠ w (or both).
2. (a) Show that A x (B ∪ C) = (A x B) ∪ (A x C).

(b) Show that if A x B = A x C and A ≠ Ø, then B = C.
3. Show that A x ⋃ℬ = ⋃{A x X | X ∈ ℬ}.


4. Show that there is no set to which every ordered pair belongs.
5. (a) Assume that A and B are given sets, and show that there exists a
set C such that for any y,


y ∈ C ⇔ y = {x} x B for some x in A.


In other words, show that {{x} x B | x ∈ A} is a set.
(b) With A, B, and C as above, show that A x B = ⋃C.


RELATIONS 


Before attempting to say what, in general, a relation is, it would be prudent 
to contemplate a few examples. 
The ordering relation < on the set {2, 3, 5} is one example. We might
say that < relates each number to each of the larger numbers. Thus
3 < 5, so < relates 3 to 5. Pictorially we can represent this by drawing
an arrow from 3 to 5. Altogether we get three arrows in this way
(Fig. 7).


Fig. 7. The ordering relation < on {2, 3, 5}.

What set adequately encodes this ordering relation? In place of the 
arrows, we take the ordered pairs <2, 3>, <2, 5>, and <3, 5>. The set of 
these pairs 


R = {<2, 3>, <2, 5>, <3, 5>}


completely captures the information in Fig. 7. At one time it was fashionable 
to refer to the set R as the graph of the relation, a terminology that seems 
particularly appropriate if we think of R as a subset of the coordinate plane. 
But nowadays an even simpler viewpoint has become dominant: R is the 
ordering relation on {2, 3, 5}. It consists of the pairs tying each number to the 
larger numbers; a relation is this collection of “ties.” 

A homier example might be the relation of marriage. (We ignore for 
the moment the fact that we banished people from our set theory.) This 
relation is the aggregate total of individual ties between each married 
person and his or her spouse. Or to say it more mathematically, the 
relation is

{<x, y> | x is married to y}.

You have probably guessed that for us a relation will be a set of
ordered pairs. And there will be no further restrictions; any set of ordered 
pairs is some relation, even if a peculiar one. 


Definition A relation is a set of ordered pairs. 


For a relation R, we sometimes write xRy in place of <x, y> ∈ R. For
example, in the case of the ordering relation < on the set ℝ of real numbers


< = {<x, y> ∈ ℝ x ℝ | x is less than y},
the notation “x < y” is preferred to “<x, y> ∈ <.”


Examples Let ω be the set {0, 1, 2, ...}, which is introduced more formally
in the next chapter. Then the divisibility relation is


{<m, n> ∈ ω x ω | (∃p ∈ ω) m · p = n}.
The identity relation on ω is
I_ω = {<n, n> | n ∈ ω}.


And any subset of ω x ω (of which there are a great many) is some sort of
relation.


Of course some relations are much more interesting than others. In the 
coming pages we shall look at functions, equivalence relations, and ordering 
relations. At this point we make some very general definitions. 


Definition We define the domain of R (dom R), the range of R (ran R),
and the field of R (fld R) by


x ∈ dom R ⇔ ∃y <x, y> ∈ R,
x ∈ ran R ⇔ ∃t <t, x> ∈ R,
fld R = dom R ∪ ran R.


For example, let ℝ be the set of all real numbers (a set we construct
officially in Chapter 5) and suppose that R ⊆ ℝ x ℝ. Then R is a subset of
the coordinate plane (Fig. 8). The projection of R onto the horizontal axis
is dom R, and the projection onto the vertical axis is ran R.


To justify the foregoing definition, we must be sure that for any given 
R, there exists a set containing all first coordinates and second coordinates 
of pairs of R. The problem here is analogous to the recent problem of 
justifying the definition of A x B, which was accomplished by Corollary
3C. The crucial fact needed now is that there exists a large set already
containing all of the elements we seek. This fact is provided by the
following lemma, which is related to Lemma 3B. (Lemma 3D was stated
as an example in the preceding chapter.)

Lemma 3D If <x, y> ∈ A, then x and y belong to ⋃⋃A.


Proof We assume that {{x}, {x, y}} ∈ A. Consequently, {x, y} ∈ ⋃A
since it belongs to a member of A. And from this we conclude that


x ∈ ⋃⋃A and y ∈ ⋃⋃A. ⊣


This lemma indicates how we can use subset axioms to construct the 
domain and range of R: 


dom R = {x ∈ ⋃⋃R | ∃y <x, y> ∈ R},
ran R = {x ∈ ⋃⋃R | ∃t <t, x> ∈ R}.


Fig. 8. A relation as a subset of the plane.
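
The construction of dom R and ran R from ⋃⋃R can be traced on the ordering relation on {2, 3, 5}. The following Python sketch is an illustration under assumptions of our own: pairs are encoded as Kuratowski pairs of frozensets, and the names pair, big_union, dom, ran, and fld are hypothetical helpers. By Lemma 3D both coordinates of every pair in R lie in ⋃⋃R, so it suffices to search there.

```python
def pair(x, y):
    """Kuratowski pair <x, y> = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def big_union(family):
    """⋃ family: the set of all members of members of family."""
    return frozenset(m for member in family for m in member)


def dom(r):
    candidates = big_union(big_union(r))  # ⋃⋃R, as in Lemma 3D
    return frozenset(
        x for x in candidates if any(pair(x, y) in r for y in candidates)
    )


def ran(r):
    candidates = big_union(big_union(r))
    return frozenset(
        x for x in candidates if any(pair(t, x) in r for t in candidates)
    )


def fld(r):
    return dom(r) | ran(r)


# The ordering relation < on {2, 3, 5}.
R = frozenset({pair(2, 3), pair(2, 5), pair(3, 5)})
```

For this R we get dom R = {2, 3}, ran R = {3, 5}, and fld R = {2, 3, 5}, which here coincides with ⋃⋃R.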


Exercises 


6. Show that a set A is a relation iff A ⊆ dom A x ran A.
7. Show that if R is a relation, then fld R = ⋃⋃R.
8. Show that for any set 𝒜:
dom ⋃𝒜 = ⋃{dom R | R ∈ 𝒜},
ran ⋃𝒜 = ⋃{ran R | R ∈ 𝒜}.


9. Discuss the result of replacing the union operation by the intersection 
operation in the preceding problem. 


n-ARY RELATIONS 


We can extend the ideas behind ordered pairs to the case of ordered 
triples and, more generally, to ordered n-tuples. For triples we define 


<x, y, z> = <<x, y>, z>.
Similarly we can form ordered quadruples:


<x₁, x₂, x₃, x₄> = <<x₁, x₂, x₃>, x₄>
= <<<x₁, x₂>, x₃>, x₄>.


Clearly we could continue in this way to define ordered quintuples or ordered
n-tuples for any particular n. It is convenient for reasons of uniformity to
define also the 1-tuple <x> = x.

We define an n-ary relation on A to be a set of ordered n-tuples with all 
components in A. Thus a binary (2-ary) relation on A is just a subset of 
A x A. And a ternary (3-ary) relation on A is a subset of (A x A) x A. 
There is, however, a terminological quirk here. If n > 1, then any n-ary
relation on A is actually a relation. But a unary (1-ary) relation on A is just 
a subset of A; thus it may not be a relation at all. 
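
The left-nested shape of n-tuples is easy to see in a small sketch. Here we assume, purely for illustration, that Python's built-in 2-tuples stand in for ordered pairs; the function name ntuple is hypothetical.

```python
from functools import reduce


def ntuple(*xs):
    """Build <x1, ..., xn> as <<x1, ..., x(n-1)>, xn>, with <x> = x.

    Python 2-tuples play the role of ordered pairs in this sketch.
    """
    if not xs:
        raise ValueError("an n-tuple needs at least one component")
    return reduce(lambda acc, x: (acc, x), xs[1:], xs[0])


triple = ntuple("x", "y", "z")          # (("x", "y"), "z")
quadruple = ntuple(1, 2, 3, 4)          # (((1, 2), 3), 4)
```

Because of the nesting, the same object can be read as a tuple of more than one length: the quadruple above is also the triple whose first component is the pair <1, 2>.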


Exercise 


10. Show that an ordered 4-tuple is also an ordered m-tuple for every 
positive integer m less than 4. 


FUNCTIONS 


Calculus books often describe a function as a rule that assigns to each 
object in a certain set (its domain) a unique object in a possibly different 
set (its range). A typical example is the squaring function, which assigns to 
each real number x its square x². The action of this function on a
particular number can be described by writing


3 ↦ 9, -2 ↦ 4, 1 ↦ 1, ½ ↦ ¼, etc.
Each individual action can be represented by an ordered pair:


<3, 9>, <-2, 4>, <1, 1>, <½, ¼>, etc.


The set of all these pairs (one for each real number) adequately represents 
the squaring function. The set of pairs has at times been called the graph 
of the function; it is a subset of the coordinate plane R x R. But the 
simplest procedure is to take this set of ordered pairs to be the function. 

Thus a function is a set of ordered pairs (i.e., a relation). But it has a 
special property: It is “single-valued,” i.e., for each x in its domain there 
is a unique y such that x ↦ y. We build these ideas into the following
definition. 


Definition A function is a relation F such that for each x in dom F there
is only one y such that xFy.

For a function F and a point x in dom F, the unique y such that xFy
is called the value of F at x and is denoted F(x). Thus <x, F(x)> ∈ F.
The “F(x)” notation was introduced by Euler in the 1700s. We hereby
resolve to use this notation only when F is a function and x ∈ dom F.
There are, however, some artificial ways of defining F(x) that are meaningful
for any F and x. For example, the set


⋃{y | <x, y> ∈ F}


is equal to F(x) whenever F is a function and x ∈ dom F.
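
The single-valuedness condition can be tested directly on a finite relation. In the following sketch we assume, for illustration only, that a relation is given as a Python set of 2-tuples; the names is_function and value are hypothetical.

```python
def is_function(rel):
    """True iff no first coordinate is related to two different objects."""
    seen = {}
    for x, y in rel:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def value(rel, x):
    """Return the unique y with <x, y> in rel.

    Raises ValueError unless exactly one such y exists, reflecting the
    resolution to write F(x) only for a function F and x in dom F.
    """
    ys = {y for (u, y) in rel if u == x}
    if len(ys) != 1:
        raise ValueError("no unique value at this point")
    return ys.pop()


# A finite piece of the squaring function, and the same pairs reversed.
squares = {(x, x * x) for x in range(-3, 4)}
reversed_squares = {(y, x) for (x, y) in squares}
```

The set squares is a function, while reversed_squares is not, since it contains both <9, 3> and <9, -3>.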

Functions are basic objects appearing in all parts of mathematics.¹ As a
result, there is a good deal of terminology used in connection with functions.
Unfortunately, no terminology has become uniformly standardized. We
collect below some of this terminology.

We say that F is a function from A into B or that F maps A into B
(written F: A → B) iff F is a function, dom F = A, and ran F ⊆ B. Note
the unequal treatment of A and B here; we demand only that ran F ⊆ B.
If, in addition, ran F = B, then F is a function from A onto B. (Thus any 
function F maps its domain onto its range. And it maps its domain into 
any set B that includes ran F. The applicability of the word “onto” depends 
both on F and on the set B, not just on F. The word “onto” must never 
be used as an adjective.) 

A function F is one-to-one iff for each y ∈ ran F there is only one x such
that xFy. For example, the function defined by


f(x) = x³ for each real number x


is one-to-one, whereas the squaring function is not, since (-3)² = 3².
One-to-one functions are sometimes called injections.

It will occasionally be useful to apply the concept of “one-to-one” to 
relations that are not functions. Since the phrase “one-to-one” seems 
inappropriate in such cases, we will use the phrase “single-rooted,” in analogy 
to “single-valued.” 


Definition A set R is single-rooted iff for each y ∈ ran R there is only
one x such that xRy.


Thus a function is single-rooted iff it is one-to-one.

It is entirely possible to have the domain of a function F consist of
ordered pairs or n-tuples. For example, addition is a function +: ℝ x ℝ → ℝ.
Thus the domain of addition consists of pairs of numbers, and the addition
function itself consists of triples of numbers. In place of +(<x, y>) we
write either +(x, y) or x + y.


¹ Despite this ubiquity, the general concept of “a function” emerged slowly over a period
of time. There was a reluctance to separate the concept of a function itself from the idea
of a written formula defining the function. There still is.

The following operations are most commonly applied to functions, 
sometimes are applied to relations, but can actually be defined for arbitrary 
sets A, F, and G. 

Definition (a) The inverse of F is the set

F⁻¹ = {<u, v> | vFu}.
(b) The composition of F and G is the set
F o G = {<u, v> | ∃t(uGt & tFv)}.
(c) The restriction of F to A is the set
F ↾ A = {<u, v> | uFv & u ∈ A}.
(d) The image of A under F is the set
F[A] = ran(F ↾ A)
= {v | (∃u ∈ A) uFv}.


F[A] can be characterized more simply when F is a function and
A ⊆ dom F; in this case


F[A] = {F(u) | u ∈ A}.
In each case we can easily apply a subset axiom to establish the
existence of the desired set. Specifically, F⁻¹ ⊆ ran F x dom F, F o G ⊆
dom G x ran F, F ↾ A ⊆ F, and F[A] ⊆ ran F. (A more detailed justification
of the definition of F⁻¹ would go as follows: By a subset axiom
there is a set B such that for any x,


x ∈ B ⇔ x ∈ ran F x dom F & ∃u ∃v(x = <u, v> & vFu).
It then follows that
x ∈ B ⇔ ∃u ∃v(x = <u, v> & vFu).


This unique set B we denote as F⁻¹.)
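
The four operations just defined translate almost verbatim into set comprehensions. The sketch below is an illustration under our own assumptions: F and G are finite relations given as Python sets of 2-tuples (the definitions in the text apply to arbitrary sets, which this sketch does not attempt to handle), and the function names are hypothetical.

```python
def inverse(f):
    """F^-1 = {<u, v> | vFu}."""
    return {(v, u) for (u, v) in f}


def compose(f, g):
    """F o G = {<u, v> | uGt and tFv for some t}; note that G acts first."""
    return {(u, v) for (u, t) in g for (s, v) in f if s == t}


def restrict(f, a):
    """F restricted to A: {<u, v> | uFv and u in A}."""
    return {(u, v) for (u, v) in f if u in a}


def image(f, a):
    """F[A] = {v | uFv for some u in A}."""
    return {v for (u, v) in f if u in a}


# A finite piece of the squaring function.
F = {(x, x * x) for x in range(-3, 4)}
```

For instance, image(F, {-1, 0, 1, 2}) is {0, 1, 4}, and image(inverse(F), {9}) is {-3, 3} even though inverse(F) is not a function.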


Example Let F: ℝ → ℝ be defined by the equation F(x) = x². Let A be
the set {x ∈ ℝ | -1 ≤ x ≤ 2}, i.e., the closed interval [-1, 2]. Then
F[A] = [0, 4]; see Fig. 9. And F⁻¹[A] = [-√2, √2]. Notice that although F
here is a function, F⁻¹ is not a function, because both <9, 3> and <9, -3>
are in F⁻¹. The so-called “multiple-valued functions” are relations, not
functions. People write “F⁻¹(9) = ±3,” but it would be preferable to write:


F⁻¹[{9}] = {-3, 3}.


Fig. 9. F[A] is the image of the set A under F.


Example In mathematical analysis it is often necessary to consider the
“inverse image” of a set A under a function F, i.e., the set F⁻¹[A]. For a
function F,


F⁻¹[A] = {x ∈ dom F | F(x) ∈ A}
(Exercise 24). In general, F⁻¹ will not be a function.


Example Let g be the sine function of trigonometry. Then g⁻¹ is not a
function. (Why not?) But the restriction of g to the closed interval
[-π/2, π/2] is one-to-one, and its inverse


(g ↾ [-π/2, π/2])⁻¹
is the arc sine function.


Example Assume that we have the set of all people (!) in mind, and we 
define P to be the relation of parenthood, i.e., 


P = {<x, y> | x is a parent of y}.
Then
P⁻¹ = {<x, y> | x is a son or daughter of y}
and
P o P = {<x, y> | x is a grandparent of y}.
If A is the set of people born in Poland, then


(P o P)[A] = {t | a grandparent of t was born in Poland}.
To further complicate matters, let S be the relation that holds between
siblings:
S = {<x, y> | x is a brother or sister of y}.
Then S⁻¹ = S. To find out what P o S is, we can calculate:
<x, y> ∈ P o S ⇔ xSt & tPy for some t
⇔ x is a sibling of a parent of y
⇔ x is an aunt or uncle of y.
None of the relations in this example are functions. 


Example Let 


F = {<Ø, a>, <{Ø}, b>}.


Observe that F is a function. We have


F⁻¹ = {<a, Ø>, <b, {Ø}>}.
F⁻¹ is a function iff a ≠ b. The restriction of F to Ø is Ø, but


F ↾ {Ø} = {<Ø, a>}.
Consequently, F[{Ø}] = {a}, in contrast to the fact that F({Ø}) = b.


The following facts about inverses are not difficult to show; the proofs of 
some of them are left as exercises. 


Theorem 3E For a set F, dom F⁻¹ = ran F and ran F⁻¹ = dom F. For
a relation F, (F⁻¹)⁻¹ = F.


Theorem 3F For a set F, F⁻¹ is a function iff F is single-rooted. A
relation F is a function iff F⁻¹ is single-rooted.


Theorem 3G Assume that F is a one-to-one function. If x ∈ dom F,
then F⁻¹(F(x)) = x. If y ∈ ran F, then F(F⁻¹(y)) = y.


Proof Suppose that x ∈ dom F; then <x, F(x)> ∈ F and <F(x), x> ∈ F⁻¹.
Thus F(x) ∈ dom F⁻¹. F⁻¹ is a function by Theorem 3F, so x = F⁻¹(F(x)).
If y ∈ ran F, then by applying the first part of the theorem to F⁻¹ we
obtain the equation (F⁻¹)⁻¹(F⁻¹(y)) = y. But (F⁻¹)⁻¹ = F. ⊣


In place of Theorem 3G, we could have defined F⁻¹ (for a one-to-one
function F) to be that function whose value at F(x) is x (and whose domain
is ran F). But this would be too restrictive; F⁻¹ can be a useful relation even
when it is not a function. Hence we prefer a definition of F⁻¹ that is
applicable to any set.

Theorem 3H Assume that F and G are functions. Then F o G is a
function, its domain is


{x ∈ dom G | G(x) ∈ dom F},
and for x in its domain, (F o G)(x) = F(G(x)).

Proof To see that F o G is a function, assume that x(F o G)y and
x(F o G)y′. Then for some t and t′,

xGt & tFy and xGt′ & t′Fy′.
Since G is a function, t = t′. Since F is a function, y = y′. Hence F o G is a
function.

Now suppose that x ∈ dom G and G(x) ∈ dom F. We must show that
x ∈ dom(F o G) and that (F o G)(x) = F(G(x)). We have <x, G(x)> ∈ G and
<G(x), F(G(x))> ∈ F. Hence <x, F(G(x))> ∈ F o G, and this yields the desired
facts.


Conversely, if x ∈ dom(F o G), then we know that for some y and t,
xGt and tFy. Hence x ∈ dom G and t = G(x) ∈ dom F. ⊣


Again we could have defined F o G (for functions F and G) as the function
with the properties stated in the above theorem. But we prefer to use a
definition applicable to nonfunctions as well. For example, in Exercise 32,
we will want R o R for an arbitrary relation R.


Example Assume that G is some one-to-one function. Then by Theorem
3H, G⁻¹ o G is a function, its domain is


{x ∈ dom G | G(x) ∈ dom G⁻¹} = dom G,
and for x in its domain,
(G⁻¹ o G)(x) = G⁻¹(G(x))
= x by Theorem 3G.


Thus G⁻¹ o G is I_{dom G}, the identity function on dom G, by Exercise 11.
Similarly one can show that G o G⁻¹ is I_{ran G} (Exercise 25).


Theorem 3I For any sets F and G,
(F o G)⁻¹ = G⁻¹ o F⁻¹.
Proof Both (F o G)⁻¹ and G⁻¹ o F⁻¹ are relations. We calculate:
<x, y> ∈ (F o G)⁻¹ ⇔ <y, x> ∈ F o G
⇔ yGt & tFx for some t
⇔ xF⁻¹t & tG⁻¹y for some t
⇔ <x, y> ∈ G⁻¹ o F⁻¹. ⊣


In a less abstract form, Theorem 3I expresses common knowledge. In
getting dressed, one first puts on socks and then shoes. But in the inverse
process of getting undressed, one first removes shoes and then socks.
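
Theorem 3I can be spot-checked on randomly generated finite relations. The sketch below is an illustration, not a proof; it assumes relations given as sets of 2-tuples over the points 0 to 3, and the helper names and the fixed random seed are our own choices.

```python
import random


def inverse(f):
    """F^-1 = {<u, v> | vFu}."""
    return {(v, u) for (u, v) in f}


def compose(f, g):
    """F o G = {<u, v> | uGt and tFv for some t}."""
    return {(u, v) for (u, t) in g for (s, v) in f if s == t}


rng = random.Random(0)  # fixed seed so that the run is reproducible
points = range(4)
all_pairs = [(x, y) for x in points for y in points]


def random_relation():
    """A random subset of {0, 1, 2, 3} x {0, 1, 2, 3}."""
    return {p for p in all_pairs if rng.random() < 0.4}


checked = 0
for _ in range(200):
    F, G = random_relation(), random_relation()
    # (F o G)^-1 = G^-1 o F^-1: the order of the factors is reversed.
    assert inverse(compose(F, G)) == compose(inverse(G), inverse(F))
    checked += 1
```

None of the sampled relations need be functions, in keeping with the generality of the theorem.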


Theorem 3J Assume that F: A → B, and that A is nonempty.


(a) There exists a function G: B → A (a “left inverse”) such that G o F
is the identity function I_A on A iff F is one-to-one.

(b) There exists a function H: B → A (a “right inverse”) such that F o H
is the identity function I_B on B iff F maps A onto B.


Proof (a) First assume that there is a function G for which G o F = I_A.
If F(x) = F(y), then by applying G to both sides of the equation we have


x = G(F(x)) = G(F(y)) = y,


and hence F is one-to-one.

For the converse, assume that F is one-to-one. Then F⁻¹ is a function
from ran F onto A (by Theorems 3E and 3F). The idea is to extend F⁻¹ to a
function G defined on all of B. By assumption A is nonempty, so we can
fix some a in A. Then we define G so that it assigns a to every point in
B - ran F:


G(x) = F⁻¹(x) if x ∈ ran F,
G(x) = a if x ∈ B - ran F.


In one line,
G = F⁻¹ ∪ (B - ran F) x {a}


(see Fig. 10a). This choice for G does what we want: G is a function mapping
B into A, dom(G o F) = A, and G(F(x)) = F⁻¹(F(x)) = x for each x in A.
Hence G o F = I_A.
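
The one-line description of G can be executed on a small example. In this sketch the sets A and B, the function F, and the helper names left_inverse and compose are assumptions made for illustration; functions are represented as sets of 2-tuples.

```python
def compose(f, g):
    """F o G = {<u, v> | uGt and tFv for some t}."""
    return {(u, v) for (u, t) in g for (s, v) in f if s == t}


def left_inverse(f, b, a):
    """G = F^-1 ∪ ((B - ran F) x {a}) for a one-to-one F into B.

    `a` is the fixed member of A sent to by every point outside ran F.
    """
    ran_f = {y for (_, y) in f}
    f_inverse = {(y, x) for (x, y) in f}
    return f_inverse | {(y, a) for y in b - ran_f}


A = {0, 1, 2}
B = {"p", "q", "r", "s"}
F = {(0, "p"), (1, "q"), (2, "r")}   # one-to-one, but not onto B
G = left_inverse(F, B, a=0)
identity_on_A = {(x, x) for x in A}
```

Here G extends F⁻¹ by the single pair <"s", 0>, and compose(G, F) is the identity on A. No choices are needed beyond the one fixed point a, which is why part (a) does not require the axiom of choice.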

(b) Next assume that there is a function H for which F o H = I_B. Then
for any y in B we have y = F(H(y)), so that y ∈ ran F. Thus ran F is all of B.

The converse poses a difficulty. We cannot take H = F⁻¹, because in
general F will not be one-to-one and so F⁻¹ will not be a function. Assume
that F maps A onto B, so that ran F = B. The idea is that for each y ∈ B
we must choose some x for which F(x) = y and then let H(y) be the chosen
x. Since y ∈ ran F we know that such x's exist, so there is no problem (see
Fig. 10b).

Or is there? For any one y we know there exists an appropriate x. 
But that is not by itself enough to let us form a function H. We have in 
general no way of defining any one particular choice of x. What is needed 
here is the axiom of choice. 


Fig. 10. The proof of Theorem 3J. In part (a), make G(x) = a for x ∈ B - ran F. In part
(b), H(y) is the chosen x for which F(x) = y.


Axiom of Choice (first form) For any relation R there is a function
H ⊆ R with dom H = dom R.


With this axiom we can now proceed with the proof of Theorem 3J(b);
take H to be a function with H ⊆ F⁻¹ and dom H = dom F⁻¹ = B. Then
H does what we want: Given any y in B, we have <y, H(y)> ∈ F⁻¹; hence
<H(y), y> ∈ F, and so F(H(y)) = y. ⊣


In Chapter 6 we will give a systematic discussion of the axiom of choice. 
It is the only axiom that we discuss without using the marginal stripe.


Theorem 3K The following hold for any sets. (F need not be a function.)
(a) The image of a union is the union of the images:
F[A ∪ B] = F[A] ∪ F[B] and F[⋃𝒜] = ⋃{F[A] | A ∈ 𝒜}.


(b) The image of an intersection is included in the intersection of the
images:


F[A ∩ B] ⊆ F[A] ∩ F[B] and F[⋂𝒜] ⊆ ⋂{F[A] | A ∈ 𝒜}


for nonempty 𝒜. Equality holds if F is single-rooted.
(c) The image of a difference includes the difference of the images:


F[A] - F[B] ⊆ F[A - B].
Equality holds if F is single-rooted.


Example Let F: ℝ → ℝ be defined by F(x) = x². Let A and B be the
closed intervals [-2, 0] and [1, 2]:


A = {x | -2 ≤ x ≤ 0} and B = {x | 1 ≤ x ≤ 2}.


Then F[A] = [0, 4] and F[B] = [1, 4]. This example shows that equality
does not always hold in parts (b) and (c) of Theorem 3K, for F[A ∩ B] =
F[Ø] = Ø, whereas F[A] ∩ F[B] = [1, 4]. And F[A] - F[B] = [0, 1),
whereas F[A - B] = F[A] = [0, 4].
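
A finite analogue of this example makes the failure of equality concrete. The sketch below restricts the squaring function to the integers from -2 to 2; this restriction, the sets A and B, the doubling function D, and the helper name image are our own assumptions for illustration.

```python
def image(f, a):
    """F[A] = {v | uFv for some u in A}."""
    return {v for (u, v) in f if u in a}


# Squaring on the integers -2, ..., 2 (not single-rooted).
F = {(x, x * x) for x in range(-2, 3)}
A = {-2, -1, 0}
B = {1, 2}

union_holds = image(F, A | B) == image(F, A) | image(F, B)         # part (a)
intersection_gap = (image(F, A & B), image(F, A) & image(F, B))     # part (b)
difference_gap = (image(F, A) - image(F, B), image(F, A - B))       # part (c)

# Doubling is single-rooted, so equality is restored in (b) and (c).
D = {(x, 2 * x) for x in range(-2, 3)}
A2, B2 = {-2, -1, 0}, {-1, 0, 1}
```

Here intersection_gap is (∅, {1, 4}) and difference_gap is ({0}, {0, 1, 4}), matching the pattern of the interval example: the inclusions of Theorem 3K hold but are proper.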


Proof To prove Theorem 3K we calculate 


y ∈ F[A ∪ B] ⇔ (∃x ∈ A ∪ B) xFy
⇔ (∃x ∈ A) xFy or (∃x ∈ B) xFy
⇔ y ∈ F[A] or y ∈ F[B].


This proves the first half of (a). For intersections we have the corresponding
calculation, except that the middle step


(∃x ∈ A ∩ B) xFy ⇒ (∃x ∈ A) xFy & (∃x ∈ B) xFy


is not always reversible. It is possible that both x₁ ∈ A with x₁Fy and x₂ ∈ B
with x₂Fy, and yet there might be no x in A ∩ B with xFy. But if F is
single-rooted, then x₁ = x₂ and so it is in A ∩ B. Thus we obtain the first half
of (b).

The second halves of (a) and (b) generalize the first halves. The proofs
follow the same outlines as the first halves, but we leave the details to
Exercise 26.

For part (c) we also calculate:
y ∈ F[A] - F[B] ⇔ (∃x ∈ A) xFy & ¬(∃t ∈ B) tFy
⇒ (∃x ∈ A - B) xFy
⇔ y ∈ F[A - B].
Again if F is single-rooted, then there is only one x such that xFy. In this
case the middle step can be reversed. ⊣


Since the inverse of a function is always single-rooted, we have as an 
immediate consequence of Theorem 3K that unions, intersections, and 
relative complements are always preserved under inverse images. 


Corollary 3L For any function G and sets A, B, and 𝒜:
G⁻¹[⋃𝒜] = ⋃{G⁻¹[A] | A ∈ 𝒜},
G⁻¹[⋂𝒜] = ⋂{G⁻¹[A] | A ∈ 𝒜} for 𝒜 ≠ Ø,
G⁻¹[A - B] = G⁻¹[A] - G⁻¹[B].

We conclude our discussion of functions with some definitions that may 
be useful later. Our intent is to build a large working vocabulary of set- 
theoretic notations. 

An infinite union is often “indexed,” as when we write ⋃_{i ∈ I} A_i. We
can give a formal definition to such a union as follows. Let I be a set,
called the index set. Let F be a function whose domain includes I. Then we
define


⋃_{i ∈ I} F(i) = ⋃{F(i) | i ∈ I}
= {x | x ∈ F(i) for some i in I}.
For example, if I = {0, 1, 2, 3}, then
⋃_{i ∈ I} F(i) = ⋃{F(0), F(1), F(2), F(3)}
= F(0) ∪ F(1) ∪ F(2) ∪ F(3).
Similar remarks apply to intersections (provided that I is nonempty):


⋂_{i ∈ I} F(i) = ⋂{F(i) | i ∈ I}
= {x | x ∈ F(i) for every i in I}.
If we use the alternative notation


F_i = F(i),
then we can rewrite the above equations as
⋃_{i ∈ I} F_i = ⋃{F_i | i ∈ I}
= {x | x ∈ F_i for some i in I}
and
⋂_{i ∈ I} F_i = ⋂{F_i | i ∈ I}
= {x | x ∈ F_i for every i in I}.
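
For a finite index set the indexed union and intersection can be computed directly. In the sketch below the function F is modelled, as an assumption of ours, by a Python dict from indices to sets whose keys include the index set; the names indexed_union and indexed_intersection are hypothetical.

```python
from functools import reduce


def indexed_union(f, index_set):
    """The union of f[i] over i in index_set."""
    return set().union(*(f[i] for i in index_set))


def indexed_intersection(f, index_set):
    """The intersection of f[i] over i in index_set (which must be nonempty)."""
    if not index_set:
        raise ValueError("the index set must be nonempty")
    return reduce(lambda acc, s: acc & s, (set(f[i]) for i in index_set))


# The domain of F includes I but is larger, as the definition allows.
F = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {3, 4, 5}, 3: {3, 6}, 4: {99}}
I = {0, 1, 2, 3}
```

With these values the union is {1, 2, 3, 4, 5, 6} and the intersection is {3}; the value F(4) plays no role because 4 is not in I.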
For sets A and B we can form the collection of functions F from A into B.
Call the set of all such functions ^A B:


^A B = {F | F is a function from A into B}.


If F: A → B, then F ⊆ A x B, and so F ∈ P(A x B). Consequently we
can apply a subset axiom to P(A x B) to construct the set of all functions
from A into B.


The notation ^A B is read “B-pre-A.” Some authors write B^A instead; this
notation is derived from the fact that if A and B are finite sets and the
number of elements in A and B is a and b, respectively, then ^A B has b^a
members. (To see this, note that for each of the a elements of A, we can
choose among b points in B into which it could be mapped. The number
of ways of making all a such choices is b · b · ... · b, a times.) We will return
to this point in Chapter 6.


Example Let ω = {0, 1, 2, ...}. Then ^ω{0, 1} is the set of all possible
functions f: ω → {0, 1}. Such an f can be thought of as an infinite sequence
f(0), f(1), f(2), ... of 0's and 1's.


Example For a nonempty set A, we have ^A Ø = Ø. This is because no
function could have a nonempty domain and an empty range. On the other
hand, ^Ø A = {Ø} for any set A, because Ø: Ø → A, but Ø is the only
function with empty domain. As a special case, we have ^Ø Ø = {Ø}.
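
For finite A and B the set of all functions from A into B can be enumerated, which confirms the count b^a and the two boundary cases just described. The following sketch is an illustration under our own assumptions: a function is represented as a frozenset of 2-tuples, and functions_from is a hypothetical helper name.

```python
from itertools import product


def functions_from(a, b):
    """All functions from the finite set a into the finite set b.

    Each function is returned as a frozenset of (argument, value) pairs.
    """
    a_list = list(a)
    b_list = list(b)
    return {
        frozenset(zip(a_list, values))
        for values in product(b_list, repeat=len(a_list))
    }


A = {0, 1, 2}
B = {"x", "y"}
all_functions = functions_from(A, B)   # 2 ** 3 = 8 functions
```

When A is empty, the enumeration yields exactly one function, the empty set; when B is empty and A is not, it yields none.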


Exercises 


11. Prove the following version (for functions) of the extensionality
principle: Assume that F and G are functions, dom F = dom G, and
F(x) = G(x) for all x in the common domain. Then F = G.


12. Assume that f and g are functions and show that
f ⊆ g ⇔ dom f ⊆ dom g & (∀x ∈ dom f) f(x) = g(x).
13. Assume that f and g are functions with f ⊆ g and dom g ⊆ dom f. Show
that f = g.
14. Assume that f and g are functions.
(a) Show that f ∩ g is a function.
(b) Show that f ∪ g is a function iff f(x) = g(x) for every x in
(dom f) ∩ (dom g).
15. Let 𝒜 be a set of functions such that for any f and g in 𝒜, either
f ⊆ g or g ⊆ f. Show that ⋃𝒜 is a function.
16. Show that there is no set to which every function belongs.
17. Show that the composition of two single-rooted sets is again
single-rooted. Conclude that the composition of two one-to-one functions is again
one-to-one.


18. Let R be the set


{<0, 1>, <0, 2>, <0, 3>, <1, 2>, <1, 3>, <2, 3>}.
Evaluate the following: R o R, R ↾ {1}, R⁻¹ ↾ {1}, R[{1}], and R⁻¹[{1}].
19. Let
A = {<Ø, {Ø, {Ø}}>, <{Ø}, Ø>}.
Evaluate each of the following: A(Ø), A[Ø], A[{Ø}], A[{Ø, {Ø}}], A⁻¹,
A o A, A ↾ Ø, A ↾ {Ø}, A ↾ {Ø, {Ø}}, ⋃⋃A.
20. Show that F ↾ A = F ∩ (A x ran F).
21. Show that (R o S) o T = R o (S o T) for any sets R, S, and T.
22. Show that the following are correct for any sets.
(a) A ⊆ B ⇒ F[A] ⊆ F[B].
(b) (F o G)[A] = F[G[A]].
(c) Q ↾ (A ∪ B) = (Q ↾ A) ∪ (Q ↾ B).
23. Let I_A be the identity function on the set A. Show that for any sets
B and C,
B o I_A = B ↾ A and I_A[C] = A ∩ C.


24. Show that for a function F, F⁻¹[A] = {x ∈ dom F | F(x) ∈ A}.

25. (a) Assume that G is a one-to-one function. Show that G o G⁻¹ is
I_{ran G}, the identity function on ran G.
(b) Show that the result of part (a) holds for any function G, not
necessarily one-to-one.

26. Prove the second halves of parts (a) and (b) of Theorem 3K.


27. Show that dom(F o G) = G⁻¹[dom F] for any sets F and G. (F and G
need not be functions.)
28. Assume that f is a one-to-one function from A into B, and that G is
the function with dom G = PA defined by the equation G(X) = f[X]. Show
that G maps PA one-to-one into PB.


29. Assume that f: A → B and define a function G: B → PA by
G(b) = {x ∈ A | f(x) = b}.

Show that if f maps A onto B, then G is one-to-one. Does the converse hold?
30. Assume that F: PA → PA and that F has the monotonicity property:
X ⊆ Y ⊆ A ⇒ F(X) ⊆ F(Y).

Define
B = ⋂{X ⊆ A | F(X) ⊆ X} and C = ⋃{X ⊆ A | X ⊆ F(X)}.


(a) Show that F(B) = B and F(C) = C.
(b) Show that if F(X) = X, then B ⊆ X ⊆ C.


INFINITE CARTESIAN PRODUCTS²


We can form something like the Cartesian product of infinitely many 
sets, provided that the sets are suitably indexed. More specifically, let I be 
a set (which we will refer to as the index set) and let H be a function whose 
domain includes I. Then for each i in J we have the set H(i); we want the 
product of the H(i)’s for all ie I. We define: 


X A) = {f |f is a function with domain I and (Vie I) f (i) € H(i}. 


Thus the members of X, , H(i) are “J-tuples” (i.e., functions with domain 7) 
for which the “ith coordinate” (i.e., the value at i) is in H(i). 


The members of X, ., H(i) are all functions from J into ();e; H(i) 
and hence are members of *(\),,, H(i)). Thus the set Xie; H(i) can be 
formed by applying a subset axiom to "(\),., H(i)). 


Example If for every ie I we have H(i) = A for some one fixed A, then 
X,., Hi) =". 


Example Assume that the index set is the set œ = {0, 1, 2, ...}. Then 
X; <a H(i) consists of “w-sequences” (ie. functions with domain œ) that 
have for their ith term some member of H(i). If we picture the sets H(i) as 
shown in Fig. 11, then a typical member of Xico H(i) is a “thread” that 
selects a point from each set. 
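
For a finite index set the definition can be carried out mechanically. The following Python sketch is only an illustration under stated assumptions: H is modeled as a dict from indices to finite sets, each I-tuple is modeled as a dict with domain I, and the name indexed_product is hypothetical (it does not come from the text).

```python
from itertools import product

def indexed_product(H):
    """Return every function f with domain I = H.keys() such that
    f(i) is a member of H[i]; each f is represented as a dict."""
    index_set = list(H)                          # the index set I
    choices = [sorted(H[i]) for i in index_set]  # the members of each H(i)
    return [dict(zip(index_set, combo)) for combo in product(*choices)]

H = {0: {"a", "b"}, 1: {"c"}, 2: {"d", "e", "f"}}
tuples = indexed_product(H)

# One I-tuple for each way of picking a member of every H(i): 2 * 1 * 3 of them.
print(len(tuples))

# If some H(i) is empty, no function can satisfy f(i) ∈ H(i).
print(indexed_product({0: {"a"}, 1: set()}))
```

Each dict in the result is one “thread”: it selects exactly one member from every H(i).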


² This section may be omitted if certain obvious adjustments are made in Theorem 6M.

If any one H(i) is empty, then clearly the product ⨉_{i∈I} H(i) is empty.
Conversely, suppose that H(i) ≠ ∅ for every i in I. Does it follow that
⨉_{i∈I} H(i) ≠ ∅? To obtain a member f of the product, we need to select
some member from each H(i), and put f(i) equal to that selected member.
This requires the axiom of choice, and in fact this is one of the many
equivalent ways of stating the axiom.


Axiom of Choice (second form) For any set I and any function H
with domain I, if H(i) ≠ ∅ for all i in I, then ⨉_{i∈I} H(i) ≠ ∅.


[Figure: the sets H(0), H(1), H(2), H(3), H(4), with a thread selecting one point from each.]
Fig. 11. The thread is a member of the Cartesian product.


Exercise 


31. Show that from the first form of the axiom of choice we can prove
the second form, and conversely.


EQUIVALENCE RELATIONS 


Consider a set A (Fig. 12a). We might want to partition A into little
boxes (Fig. 12b). For example, take A = ω; we can partition ω into six
parts:

    {0, 6, 12, ...},
    {1, 7, 13, ...},
    ...
    {5, 11, 17, ...}.


By “partition” we mean that every element of A is in exactly one little box, 
and that each box is a nonempty subset of A. 

Now we need some mental agility. We want to think of each little box 
as being a single object, instead of thinking of it as a plurality of objects. 
(Actually we have been doing this sort of thing throughout the book,
whenever we think of a set as a single object. It is really no harder than
thinking of a brick house as a single object and not as a multitude of
bricks.) This changes the picture (Fig. 12c); each box is now, in our mind,
a single point. The set B of boxes is very different from the set A. In our 
example, B has only six members whereas A is infinite. (When we get around 
to defining “six” and “infinite” officially, we must certainly do it in a way 
that makes the preceding sentence true.) 

The process of transforming a situation like Fig. 12a into Fig. 12c is 
common in abstract algebra and elsewhere in mathematics. And in Chapter 
5 the process will be applied several times in the construction of the real 
numbers. 

Suppose we now define a binary relation R on A as follows: For
x and y in A,

    xRy ⇔ x and y are in the same little box.


[Figure: three panels. (a) The set A. (b) A partitioned into little boxes. (c) Each box regarded as a single point.]
Fig. 12. Partitioning a set into six little boxes.


Then we can easily see that R has the following three properties.


1. R is reflexive on A, by which we mean that xRx for all x ∈ A.
2. R is symmetric, by which we mean that whenever xRy, then also
yRx.
3. R is transitive, by which we mean that whenever xRy and yRz,
then also xRz.


Definition R is an equivalence relation on A iff R is a binary relation 
on A that is reflexive on A, symmetric, and transitive. 


Theorem 3M If R is a symmetric and transitive relation, then R is an
equivalence relation on fld R.

Proof Any relation R is a binary relation on its field, since

    R ⊆ dom R × ran R ⊆ fld R × fld R.

What we must show is that R is reflexive on fld R. We have

    x ∈ dom R ⇒ xRy for some y
              ⇒ xRy & yRx      by symmetry
              ⇒ xRx            by transitivity,

and a similar calculation applies to points in ran R. ⊣

This theorem deserves a precautionary note: If R is a symmetric and
transitive relation on A, it does not follow that R is an equivalence relation
on A. R is reflexive on fld R, but fld R may be a small subset of A.
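
The precautionary note can be checked on a small finite case. The sketch below is illustrative only: relations are modeled as Python sets of pairs, and the helper names (field, is_equivalence_on, and so on) are assumptions introduced here, not notation from the text.

```python
def field(R):
    """fld R = dom R ∪ ran R for a set R of ordered pairs."""
    return {x for x, _ in R} | {y for _, y in R}

def is_reflexive_on(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for x, y in R)

def is_transitive(R):
    return all((x, z) in R for x, y in R for w, z in R if y == w)

def is_equivalence_on(R, A):
    """R is a binary relation on A that is reflexive on A, symmetric, transitive."""
    return (all(x in A and y in A for x, y in R)
            and is_reflexive_on(R, A) and is_symmetric(R) and is_transitive(R))

A = {0, 1, 2, 3}
R = {(0, 0), (0, 1), (1, 0), (1, 1)}    # symmetric and transitive

print(is_equivalence_on(R, field(R)))   # Theorem 3M: equivalence relation on fld R
print(is_equivalence_on(R, A))          # fails: 2 and 3 are in A but not in fld R
```

Here fld R = {0, 1} is a proper subset of A, which is exactly the situation the note warns about.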

We have shown how a partition of a set A induces an equivalence 
relation. (A more formal version of this is in Exercise 37.) Next we want to 
reverse the process, and show that from any equivalence relation R on A, 
we get a partition of A. 


Definition The set [x]_R is defined by

    [x]_R = {t | xRt}.

If R is an equivalence relation and x ∈ fld R, then [x]_R is called the
equivalence class of x (modulo R). If the relation R is fixed by the context,
we may write just [x].


The status of [x]_R as a set is guaranteed by a subset axiom, since
[x]_R ⊆ ran R. Furthermore we can construct a set of equivalence classes
such as {[x]_R | x ∈ A}, since this set is included in 𝒫(ran R).


Lemma 3N Assume that R is an equivalence relation on A and that
x and y belong to A. Then

    [x]_R = [y]_R iff xRy.

Proof (⇒) Assume that [x]_R = [y]_R. We know that y ∈ [y]_R (because
yRy), consequently y ∈ [x]_R (because [x]_R = [y]_R). By the definition of
[x]_R, this means that xRy.

(⇐) Next assume that xRy. Then

    t ∈ [y]_R ⇒ yRt
              ⇒ xRt          because xRy and R is transitive
              ⇒ t ∈ [x]_R.

Thus [y]_R ⊆ [x]_R. Since R is symmetric, we also have yRx and we can
reverse x and y in the above argument to obtain [x]_R ⊆ [y]_R. ⊣


Definition A partition Π of a set A is a set of nonempty subsets of A
that is disjoint and exhaustive, i.e.,

(a) no two different sets in Π have any common elements, and
(b) each element of A is in some set in Π.


Theorem 3P Assume that R is an equivalence relation on A. Then the
set {[x]_R | x ∈ A} of all equivalence classes is a partition of A.

Proof Each equivalence class [x]_R is nonempty (because x ∈ [x]_R) and is
a subset of A (because R is a binary relation on A). The main thing that we
must prove is that the collection of equivalence classes is disjoint, i.e., part
(a) of the above definition is satisfied. So suppose that [x]_R and [y]_R have a
common element t. Thus

    xRt and yRt.

But then xRy (since tRy by symmetry, and so xRy by transitivity), and by
Lemma 3N, [x]_R = [y]_R. ⊣

If R is an equivalence relation on A, then we can define the quotient set

    A/R = {[x]_R | x ∈ A}

whose members are the equivalence classes. (The expression A/R is read
“A modulo R.”) We also have the natural map (or canonical map)
φ: A → A/R defined by

    φ(x) = [x]_R

for x ∈ A.

Example Let ω = {0, 1, 2, ...}; define the binary relation ~ on ω by

    m ~ n ⇔ m − n is divisible by 6.

Then ~ is an equivalence relation on ω (as you should verify). The
quotient set ω/~ has six members:

    [0], [1], [2], [3], [4], [5],

corresponding to the six possible remainders after division by 6.
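
Because ω is infinite, a program can inspect only a finite initial segment of it. As an illustrative sketch (the cutoff 36 and the helper names related and eq_class are assumptions, not from the text), the following Python code computes the classes [x] = {t | x ~ t} inside {0, ..., 35} and checks that they form a partition with six members.

```python
N = 36                                  # arbitrary finite cutoff standing in for ω
A = range(N)

def related(m, n):
    """m ~ n iff m - n is divisible by 6."""
    return (m - n) % 6 == 0

def eq_class(x):
    """[x] = {t in A | x ~ t}, as a frozenset so that classes can be set members."""
    return frozenset(t for t in A if related(x, t))

quotient = {eq_class(x) for x in A}     # A/~ = {[x] | x in A}

print(len(quotient))                    # six classes, one per remainder
print(sorted(eq_class(1))[:3])          # the class of 1 begins 1, 7, 13

# Partition check: nonempty, pairwise disjoint, and exhaustive.
classes = list(quotient)
disjoint = all(c.isdisjoint(d) for c in classes for d in classes if c != d)
exhaustive = set().union(*classes) == set(A)
print(all(classes) and disjoint and exhaustive)
```

Building the quotient as a Python set makes equal classes such as [0] and [6] collapse automatically, mirroring Lemma 3N.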


Example The relation of congruence of triangles in the plane is an
equivalence relation.


Example Textbooks on linear algebra often define vectors in the plane 
as follows. Let A be the set of all directed line segments in the plane. Two 
such line segments are considered to be equivalent iff they have the same 
length and direction. A vector is then defined to be an equivalence class 
of directed line segments. But to avoid the necessity of dealing explicitly with 
equivalence relations, books use phrases like “equivalent vectors are regarded 
as equal even though they are located in different positions,” or “we write 
PQ = RS to say that PQ and RS have the same length and direction even 
though they are not identical sets of points,” or simply “we identify two line 
segments having the same length and direction.” 


Example Let F: A → B and for points in A define

    x ~ y iff F(x) = F(y).

The relation ~ is an equivalence relation on A. There is a unique one-to-one
function F̂: A/~ → B such that F = F̂ ∘ φ (where φ is the natural map as
shown in Fig. 13). The value of F̂ at a particular equivalence class is the
common value of F at the members of the equivalence class.

Fig. 13. F factors into the natural map followed by a one-to-one function.
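
The factorization can be traced on a small finite example. This is a sketch under assumptions: the set A = {−4, ..., 4} and the squaring function are hypothetical choices made here for illustration, and F_hat is stored as a dict keyed by equivalence classes.

```python
A = range(-4, 5)

def F(x):
    return x * x                        # a sample function that is not one-to-one

def phi(x):
    """Natural map: x goes to its class {t | F(t) = F(x)}."""
    return frozenset(t for t in A if F(t) == F(x))

# F_hat sends each class to the common value of F on its members.
F_hat = {phi(x): F(x) for x in A}

print(all(F_hat[phi(x)] == F(x) for x in A))          # F = F_hat ∘ phi
print(len(set(F_hat.values())) == len(F_hat))         # F_hat is one-to-one
```

The dict comprehension is safe because every member of a class gives the same value F(x), so later assignments to the same key never change it.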


The last problem we want to examine in this section is the problem of
defining functions on a quotient set. Specifically, assume that R is an
equivalence relation on A and that F: A → A. We ask whether or not there
exists a corresponding function F̂: A/R → A/R such that for all x ∈ A,

    F̂([x]_R) = [F(x)]_R

(see Fig. 14). Here we are attempting to define the value of F̂ at an
equivalence class by selecting a particular member x from the class and then
forming [F(x)]_R. But suppose x₁ and x₂ are in the same equivalence class.
Then F̂ is not well defined unless F(x₁) and F(x₂) are in the same
equivalence class.


[Figure: diagram relating F: A → A and F̂: A/R → A/R through the natural map φ: A → A/R.]
Fig. 14. This diagram is said to be commutative if F̂ ∘ φ = φ ∘ F.


Example Consider ω/~ where m ~ n iff m − n is divisible by 6. Three
functions from ω into ω are defined by

    F₁(n) = 2n,   F₂(n) = n²,   F₃(n) = 2ⁿ.

In each case we can ask whether there is F̂ᵢ: ω/~ → ω/~ such that for every
n in ω:

    F̂₁([n]) = [2n],   F̂₂([n]) = [n²],   F̂₃([n]) = [2ⁿ].

It is easy to see that if m ~ n, then 2m ~ 2n. Because of this fact F̂₁ is well
defined; that is, there exists a function F̂₁ satisfying the equation
F̂₁([n]) = [2n]. No matter what representative m of the equivalence class [n]
we look at, we always obtain the same equivalence class [2m]. (For further
details, see the proof of Theorem 3Q below.) Similarly if m ~ n, then m² ~ n²,
for recall that m² − n² = (m + n)(m − n). Consequently F̂₂ is also well
defined. On the other hand, F̂₃ is not well defined. For example, 0 ~ 6 but
2⁰ = 1 ≁ 64 = 2⁶. Thus although [0] = [6], we have [2⁰] ≠ [2⁶]. Hence there
cannot possibly exist any function F̂₃ such that the equation F̂₃([n]) = [2ⁿ]
holds for both n = 0 and n = 6.
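
The three cases can be probed by a brute-force search. The sketch below is an illustration only: the function name counterexample and the search bound 30 are assumptions, and a search that finds nothing below the bound is evidence, not a proof, that a function respects ~.

```python
def related(m, n):
    """m ~ n iff m - n is divisible by 6."""
    return (m - n) % 6 == 0

def counterexample(F, bound=30):
    """Return a pair (m, n) with m ~ n but F(m) not ~ F(n), or None if
    no such pair exists below the bound."""
    for m in range(bound):
        for n in range(m + 1, bound):   # n > m suffices because ~ is symmetric
            if related(m, n) and not related(F(m), F(n)):
                return (m, n)
    return None

print(counterexample(lambda n: 2 * n))   # None
print(counterexample(lambda n: n ** 2))  # None
print(counterexample(lambda n: 2 ** n))  # (0, 6), the pair used in the text
```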


In order to formulate a general theorem here, let us say that F is
compatible with R iff for all x and y in A,

    xRy ⇒ F(x)RF(y).


Theorem 3Q Assume that R is an equivalence relation on A and that
F: A → A. If F is compatible with R, then there exists a unique
F̂: A/R → A/R such that

    (☆)  F̂([x]_R) = [F(x)]_R for all x in A.

If F is not compatible with R, then no such F̂ exists. Analogous results
apply to functions from A × A into A.


Proof First assume that F is not compatible; we will show that there
can be no F̂ satisfying (☆). The incompatibility tells us that for certain x
and y in A we have xRy (and hence [x] = [y]) but not F(x)RF(y) (and
hence [F(x)] ≠ [F(y)]). For (☆) to hold we would need both

    F̂([x]) = [F(x)]   and   F̂([y]) = [F(y)].

But this is impossible, since the left sides coincide and the right sides differ.

Now for the converse, assume that F is compatible with R. Since (☆)
demands that the pair <[x], [F(x)]> ∈ F̂, we will try defining F̂ to be the set
of all such ordered pairs:

    F̂ = {<[x], [F(x)]> | x ∈ A}.

The crucial matter is whether this relation F̂ is a function. So consider
pairs <[x], [F(x)]> and <[y], [F(y)]> in F̂. The calculation

    [x] = [y] ⇒ xRy                  by Lemma 3N
              ⇒ F(x)RF(y)            by compatibility
              ⇒ [F(x)] = [F(y)]      by Lemma 3N

shows that F̂ is indeed a function. The remaining things to check are easier.
Clearly dom F̂ = A/R and ran F̂ ⊆ A/R, hence F̂: A/R → A/R. Finally (☆)
holds because <[x], [F(x)]> ∈ F̂.

We leave it to you to explain why F̂ is unique, and to formulate the
“analogous results” for a binary operation (Exercise 42). ⊣
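
The construction in the proof can be imitated for a finite set. In this sketch the set A = {0, ..., 11}, congruence modulo 6, and the two sample functions double and halve are all hypothetical choices made for illustration; the code forms the set of pairs <[x], [F(x)]> and tests whether it is a function.

```python
A = range(12)

def cls(x):
    """[x] for congruence modulo 6 on A."""
    return frozenset(t for t in A if (x - t) % 6 == 0)

def induced_pairs(F):
    """The relation {<[x], [F(x)]> | x in A} from the proof."""
    return {(cls(x), cls(F(x))) for x in A}

def is_function(pairs):
    """A set of pairs is a function iff no first coordinate repeats."""
    firsts = [p[0] for p in pairs]
    return len(firsts) == len(set(firsts))

double = lambda x: (2 * x) % 12         # compatible with congruence mod 6
halve = lambda x: x // 2                # not compatible: 0 ~ 6, but 0 and 3 are unrelated

print(is_function(induced_pairs(double)))
print(is_function(induced_pairs(halve)))
```

For the incompatible function the class [0] = [6] is paired with both [0] and [3], so the set of pairs fails to be a function, as the first half of the proof predicts.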


Exercises 
32. (a) Show that R is symmetric iff R⁻¹ ⊆ R.
    (b) Show that R is transitive iff R ∘ R ⊆ R.
33. Show that R is a symmetric and transitive relation iff R = R⁻¹ ∘ R.

34. Assume that 𝒜 is a nonempty set, every member of which is a
transitive relation.
    (a) Is the set ⋂𝒜 a transitive relation?
    (b) Is ⋃𝒜 a transitive relation?

35. Show that for any R and x, we have [x]_R = R[{x}].

36. Assume that f: A → B and that R is an equivalence relation on B.
Define Q to be the set

    {<x, y> ∈ A × A | <f(x), f(y)> ∈ R}.

Show that Q is an equivalence relation on A.

37. Assume that Π is a partition of a set A. Define the relation R_Π as follows:

    xR_Πy ⇔ (∃B ∈ Π)(x ∈ B & y ∈ B).

Show that R_Π is an equivalence relation on A. (This is a formalized
version of the discussion at the beginning of this section.)

38. Theorem 3P shows that A/R is a partition of A whenever R is an
equivalence relation on A. Show that if we start with the equivalence
relation R_Π of the preceding exercise, then the partition A/R_Π is just Π.

39. Assume that we start with an equivalence relation R on A and define
Π to be the partition A/R. Show that R_Π, as defined in Exercise 37, is just
R.

40. Define an equivalence relation R on the set P of positive integers by

    mRn ⇔ m and n have the same number of prime factors.

Is there a function f: P/R → P/R such that f([n]_R) = [3n]_R for each n?

41. Let ℝ be the set of real numbers and define the relation Q on ℝ × ℝ
by <u, v>Q<x, y> iff u + y = x + v.
    (a) Show that Q is an equivalence relation on ℝ × ℝ.
    (b) Is there a function G: (ℝ × ℝ)/Q → (ℝ × ℝ)/Q satisfying the
equation

    G([<x, y>]_Q) = [<x + 2y, y + 2x>]_Q?


42. State precisely the “analogous results” mentioned in Theorem 3Q. (This 
will require extending the concept of compatibility in a suitable way.) 


ORDERING RELATIONS 


The first example of a relation we gave in this chapter was the ordering
relation

    {<2, 3>, <2, 5>, <3, 5>}

on the set {2, 3, 5}; recall Fig. 7. Now we want to consider ordering
relations on other sets. In the present section we will set forth the basic
concepts, which will be useful in Chapter 5. A more thorough discussion
of ordering relations can be found in Chapter 7.

Our first need is for a definition. What, in general, should it mean to 
say that R is an ordering relation on a set A? Well, for one thing R should 
tell us, given any distinct x and y in A, just which one is smaller. No x 
should be smaller than itself. And furthermore if x is less than y and y is 
less than z, then x should be less than z. The following definition captures 
these ideas. 


Definition Let A be any set. A linear ordering on A (also called a
total ordering on A) is a binary relation R on A (i.e., R ⊆ A × A) meeting
the following two conditions:

(a) R is a transitive relation; i.e., whenever xRy and yRz, then xRz.
(b) R satisfies trichotomy on A, by which we mean that for any x and y
in A exactly one of the three alternatives

    xRy,   x = y,   yRx

holds.

To clarify the meaning of trichotomy, consider first the special case
where x and y are the same member of A (with two names). Then
trichotomy demands that exactly one of

    xRx,   x = x,   xRx

holds. Since the middle alternative certainly holds, we can conclude that
xRx never holds.

Next consider the case where x and y are two distinct members of A. 
Then the middle alternative x = y fails, so trichotomy demands that either 
xRy or yRx (but not both). Thus we have proved the following: 


Theorem 3R Let R be a linear ordering on A. 


(i) There is no x for which xRx. 
(ii) For distinct x and y in A, either xRy or yRx. 


In fact for a transitive relation R on A, conditions (i) and (ii) are 
equivalent to trichotomy. A relation meeting condition (i) is called irreflexive; 
one meeting condition (ii) is said to be connected on A. 

Note also that a linear ordering R can never lead us in circles, e.g.,
there cannot exist a circle such as

    x₁Rx₂,  x₂Rx₃,  x₃Rx₄,  x₄Rx₅,  x₅Rx₁.

This is because if we had such a circle, then by transitivity x₁Rx₁,
contradicting part (i) of the foregoing theorem.
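
For a finite set these conditions can be tested directly. The following sketch is illustrative only; relations are modeled as sets of pairs, and the helper names are assumptions introduced here. It tests transitivity, trichotomy, and conditions (i) and (ii) on the usual order on {2, 3, 5} and on a three-point circle.

```python
from itertools import product

def is_transitive(R):
    return all((x, z) in R for x, y in R for w, z in R if y == w)

def satisfies_trichotomy(R, A):
    """Exactly one of xRy, x = y, yRx holds for all x, y in A."""
    return all([(x, y) in R, x == y, (y, x) in R].count(True) == 1
               for x, y in product(A, repeat=2))

def is_irreflexive(R):
    return all(x != y for x, y in R)

def is_connected_on(R, A):
    return all((x, y) in R or (y, x) in R
               for x, y in product(A, repeat=2) if x != y)

A = {2, 3, 5}
less = {(2, 3), (2, 5), (3, 5)}          # the usual order on {2, 3, 5}
circle = {(2, 3), (3, 5), (5, 2)}        # leads in a circle, so it is not transitive

for R in (less, circle):
    print(is_transitive(R), satisfies_trichotomy(R, A),
          is_irreflexive(R) and is_connected_on(R, A))
```

The circle satisfies trichotomy but fails transitivity, so it is not a linear ordering; both conditions of the definition are needed.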

Of course “R” is not our favorite symbol for a linear ordering; our 
favorite is “<.” For then we can write “x < y” to mean that the pair 
<x, y> is a member of the set <. 

If < is a linear ordering on A and if A is not too large, then we can 
draw a picture of the ordering. We represent the members of A by dots, 
placing the dot for x below the dot for y whenever x < y. Then we add 
vertical lines to connect the dots. The resulting picture has the points of A 
stretched out along a line, in the correct order. (The adjective “linear” 
reflects the possibility of drawing this picture.) 

Figure 15 contains three such pictures. Part (a) is the picture for the 
usual order on {2, 3, 5}. Parts (b) and (c) portray the usual order on the 
natural numbers and on the integers, respectively. (Infinite pictures are more 
difficult to draw than finite pictures.) 

In addition to the concept of linear ordering, there is the more general 
concept of a partial ordering. Partial orderings are discussed in the first 
section of Chapter 7. In fact you might want to read that section next, 
before going on to Chapter 4. At least look at Figs. 43 and 44 there, 
which contrast with Fig. 15.


[Figure: three vertical line diagrams. (a) The points 2, 3, 5. (b) The points 0, 1, 2, 3, ... continuing upward. (c) The points ..., -3, -2, -1, 0, 1, 2, 3, ... continuing in both directions.]
Fig. 15. Linear orderings look linear.


Exercises 
43. Assume that R is a linear ordering on a set A. Show that R⁻¹ is also
a linear ordering on A.

44. Assume that < is a linear ordering on a set A. Assume that f: A → A
and that f has the property that whenever x < y, then f(x) < f(y). Show
that f is one-to-one and that whenever f(x) < f(y), then x < y.

45. Assume that <_A and <_B are linear orderings on A and B, respectively.
Define the binary relation <_L on the Cartesian product A × B by:

    <a₁, b₁> <_L <a₂, b₂> iff either a₁ <_A a₂ or (a₁ = a₂ & b₁ <_B b₂).

Show that <_L is a linear ordering on A × B. (The relation <_L is called
lexicographic ordering, being the ordering used in making dictionaries.)


Review Exercises 


46. Evaluate the following sets:
    (a) ⋂⋂<x, y>.
    (b) ⋂⋂⋂{<x, y>}⁻¹.

47. (a) Find all of the functions from {0, 1, 2} into {3, 4}. 
    (b) Find all of the functions from {0, 1, 2} onto {3, 4, 5}.
48. Let T be the set {∅, {∅}}.
    (a) Find all of the ordered pairs, if any, in 𝒫T.
    (b) Evaluate and simplify: (𝒫T)⁻¹ ∘ (𝒫T ↾ {∅}).
49. Find as many equivalence relations as you can on the set {0, 1, 2}.

50. (a) Find a linear ordering on {0, 1, 2, 3} that contains the ordered
pairs <0, 3> and <2, 1>.
    (b) Now find a different one meeting the same conditions.
51. Find as many linear orderings as possible on the set {0, 1, 2} that
contain the pair <2, 0>.

52. Suppose that A × B = C × D. Under what conditions can we conclude
that A = C and B = D?
53. Show that for any sets R and S we have (R ∪ S)⁻¹ = R⁻¹ ∪ S⁻¹,
(R ∩ S)⁻¹ = R⁻¹ ∩ S⁻¹, and (R − S)⁻¹ = R⁻¹ − S⁻¹.
54. Prove that the following equations hold for any sets.
    (a) A × (B ∩ C) = (A × B) ∩ (A × C).
    (b) A × (B ∪ C) = (A × B) ∪ (A × C).
    (c) A × (B − C) = (A × B) − (A × C).
55. Answer “yes” or “no.” Where the answer is negative, supply a
counterexample.
    (a) Is it always true that (A × A) ∪ (B × C) = (A ∪ B) × (A ∪ C)?
    (b) Is it always true that (A × A) ∩ (B × C) = (A ∩ B) × (A ∩ C)?
56. Answer “yes” or “no.” Where the answer is negative, supply a
counterexample.
    (a) Is dom(R ∪ S) always the same as dom R ∪ dom S?
    (b) Is dom(R ∩ S) always the same as dom R ∩ dom S?

57. Answer “yes” or “no.” Where the answer is negative, supply a
counterexample.
    (a) Is R ∘ (S ∪ T) always the same as (R ∘ S) ∪ (R ∘ T)?
    (b) Is R ∘ (S ∩ T) always the same as (R ∘ S) ∩ (R ∘ T)?
58. Give an example to show that F[F⁻¹[S]] is not always the same as S.
59. Show that for any sets Q ↾ (A ∩ B) = (Q ↾ A) ∩ (Q ↾ B) and
Q ↾ (A − B) = (Q ↾ A) − (Q ↾ B).
60. Prove that for any sets (R ∘ S) ↾ A = R ∘ (S ↾ A).


CHAPTER 4 


NATURAL NUMBERS 


There are, in general, two ways of introducing new objects for mathe- 
matical study: the axiomatic approach and the constructive approach. The 
axiomatic approach is the one we have used for sets. The concept of set 
is one of our primitive notions, and we have adopted a list of axioms 
dealing with the primitive notions. 

Now consider the matter of introducing the natural numbers¹


0, 1, 2,... 


for further study. An axiomatic approach would consider “natural number” 
as a primitive notion and would adopt a list of axioms. Instead we will 
use the constructive approach for natural numbers. We will define natural 
numbers in terms of other available objects (sets, of course). In place of 
axioms for numbers we will be able to prove the necessary properties 
of numbers from known properties of sets. 


¹ There is a curious point of terminology here. Is 0 a natural number? With surprising
consistency, the present usage is for school books (through high-school level) to exclude 0
from the natural numbers, and for upper-division college-level books to include 0. Freshman
and sophomore college books are in the transition zone. In this book we include 0 among
the natural numbers.

Constructing the natural numbers in terms of sets is part of the process
of “embedding mathematics in set theory.” The process will be continued
in Chapter 5 to obtain more exotic numbers, such as √2.


INDUCTIVE SETS 


First we need to define natural numbers as suitable sets. Now numbers 
do not at first glance appear to be sets. Not that it is an easy matter to 
say what numbers do appear to be. They are abstract concepts, which are 
slippery things to handle. (See, for example, the section on “Two” in 
Chapter 5.) Nevertheless, we can construct specific sets that will serve 
perfectly well as numbers. In fact this can be done in a variety of ways. 
In 1908, Zermelo proposed to use

    ∅, {∅}, {{∅}}, ...

as the natural numbers. Later von Neumann proposed an alternative, which
has several advantages and has become standard. The guiding principle
behind von Neumann’s construction is to make each natural number be the
set of all smaller natural numbers. Thus we define the first four natural
numbers as follows:

    0 = ∅,
    1 = {0} = {∅},
    2 = {0, 1} = {∅, {∅}},
    3 = {0, 1, 2} = {∅, {∅}, {∅, {∅}}}.

We could continue in this way to define 17 or any chosen natural number.
Notice, for example, that the set 3 has three members. It has been selected
from the class of all three-member sets to represent the size of the sets in
that class.


This construction of the numbers as sets involves some extraneous
properties that we did not originally expect. For example,

    0 ∈ 1 ∈ 2 ∈ 3 ∈ ⋯

and

    0 ⊆ 1 ⊆ 2 ⊆ 3 ⊆ ⋯.

But these properties can be regarded as accidental side effects of the
definition. They do no harm, and actually will be convenient at times.

Although we have defined the first four natural numbers, we do not yet
have a definition of what it means in general for something to be a natural
number. That is, we have not defined the set of all natural numbers. Such a
definition cannot rely on linguistic devices such as three dots or phrases
like “and so forth.” First we define some preliminary concepts.


Definition For any set a, its successor a⁺ is defined by

    a⁺ = a ∪ {a}.

A set A is said to be inductive iff ∅ ∈ A and it is “closed under successor,”
i.e.,

    (∀a ∈ A) a⁺ ∈ A.


In terms of the successor operation, the first few natural numbers can
be characterized as

    0 = ∅,  1 = ∅⁺,  2 = ∅⁺⁺,  3 = ∅⁺⁺⁺, ....

These are all distinct, e.g., ∅⁺ ≠ ∅⁺⁺⁺ (Exercise 1). And although we have
not yet given a formal definition of “infinite,” we can see informally that
any inductive set will be infinite.
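
The first few numerals can be built concretely. The sketch below is an illustration under an assumption: hereditarily finite sets are modeled by Python frozensets, which is a convenience of this example and not part of the text's development.

```python
def successor(a):
    """a⁺ = a ∪ {a}, with sets modeled as frozensets."""
    return a | frozenset([a])

zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
numerals = [zero, one, two, three]

print([len(n) for n in numerals])       # the numeral n has n members
print(three == frozenset([zero, one, two]))
# Membership chain 0 ∈ 1 ∈ 2 ∈ 3 and inclusion chain 0 ⊆ 1 ⊆ 2 ⊆ 3.
print(all(a in b for a, b in zip(numerals, numerals[1:])))
print(all(a <= b for a, b in zip(numerals, numerals[1:])))
```

No finite computation of this kind produces an inductive set, since each step adds only one more numeral; that is the gap the next axiom fills.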


We have as yet no axioms that provide for the existence of infinite 
sets. There are indeed infinitely many distinct sets whose existence we 
could establish. But there is no one set having infinitely many members 
that we can prove to exist. Consequently we cannot yet prove that any 
inductive set exists. We now correct that fault. 


Infinity Axiom There exists an inductive set:

    (∃A)[∅ ∈ A & (∀a ∈ A) a⁺ ∈ A].

Armed with this axiom, we can now define the concept of natural
number.

Definition A natural number is a set that belongs to every inductive set.


We next prove that the collection of all natural numbers constitutes a
set.


Theorem 4A There is a set whose members are exactly the natural
numbers.

Proof Let A be an inductive set; by the infinity axiom it is possible
to find such a set. By a subset axiom there is a set w such that for any x,

    x ∈ w ⇔ x ∈ A & x belongs to every other inductive set
          ⇔ x belongs to every inductive set.

(This proof is essentially the same as the proof of Theorem 2B.) ⊣

The set of all natural numbers is denoted by a lowercase Greek omega:

    x ∈ ω ⇔ x is a natural number
          ⇔ x belongs to every inductive set.

In terms of classes, we have

    ω = ⋂{A | A is inductive},

but the class of all inductive sets is not a set.

Theorem 4B ω is inductive, and is a subset of every other inductive set.

Proof First of all, ∅ ∈ ω because ∅ belongs to every inductive set.
And second,

    a ∈ ω ⇒ a belongs to every inductive set
          ⇒ a⁺ belongs to every inductive set
          ⇒ a⁺ ∈ ω.

Hence ω is inductive. And clearly ω is included in every other inductive set.
⊣


Since ω is inductive, we know that 0 (= ∅) is in ω. It then follows that
1 (= 0⁺) is in ω, as are 2 (= 1⁺) and 3 (= 2⁺). Thus 0, 1, 2, and 3 are
natural numbers. Unnecessary extraneous objects have been excluded from
ω, since ω is the smallest inductive set. This fact can also be restated as
follows.


Induction Principle for ω Any inductive subset of ω coincides with ω.


Suppose, for example, that we want to prove that for every natural
number n, the statement ___n___ holds. We form the set

    T = {n ∈ ω | ___n___}

of natural numbers for which the desired conclusion is true. If we can show
that T is inductive, then the proof is complete. Such a proof is said to be a
proof by induction. The next theorem gives a very simple example of this
method.


Theorem 4C Every natural number except 0 is the successor of some
natural number.

Proof Let T = {n ∈ ω | either n = 0 or (∃p ∈ ω) n = p⁺}. Then 0 ∈ T.
And if k ∈ T, then k⁺ ∈ T. Hence by induction, T = ω. ⊣


Exercise
See also the Review Exercises at the end of this chapter.
1. Show that 1 ≠ 3, i.e., that ∅⁺ ≠ ∅⁺⁺⁺.


PEANO’S POSTULATES²


In 1889, Peano published a study giving an axiomatic approach to the
natural numbers. He showed how the properties of natural numbers could
be developed on the basis of a small number of axioms. Although he
attributed the formulation of the axioms to Dedekind, the axioms are
generally known as “Peano’s postulates.” We will first show that the set ω
we have constructed satisfies Peano’s postulates, i.e., the “postulates” become
provable when applied to ω. Later we will prove that anything satisfying
Peano’s postulates is, in a certain specific sense, “just like” ω.


[Figure: three diagrams (a), (b), (c) of points e, S(e), ... joined by arrows showing the action of S.]
Fig. 16. Any Peano system must behave like (c).

To formulate these results more accurately, we must define the concept
of a Peano system. First of all, if S is a function and A is a subset of
dom S, then A is said to be closed under S iff whenever x ∈ A, then
S(x) ∈ A. (This can equivalently be expressed as S[A] ⊆ A.) Define a
Peano system to be a triple <N, S, e> consisting of a set N, a function
S: N → N, and a member e ∈ N such that the following three conditions
are met:

(i) e ∉ ran S.
(ii) S is one-to-one.
(iii) Any subset A of N that contains e and is closed under S equals N
itself.


The condition “e ∉ ran S” rules out loops as in Fig. 16a; here the
arrows indicate the action of S. And the requirement that S be one-to-one
rules out the system of Fig. 16b. Consequently any Peano system must,
in part, look like Fig. 16c. The last of the three conditions is the Peano
induction postulate. Its function is to rule out any points other than the
ones we expect. We expect the system to contain e, S(e), SS(e), SSS(e), ....
The Peano induction postulate replaces the three dots with a precise
set-theoretic condition, stating that nothing smaller than N itself can contain
e and be closed under S.

² The material in this section on Peano systems is not essential to our later work. But
the material on transitive sets is essential.
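
The way conditions (i) and (ii) exclude loops can be seen on small finite systems. The sketch below is illustrative only: the two three-point systems are hypothetical examples in the spirit of the loops discussed above (they are not claimed to reproduce Fig. 16), S is modeled as a dict, and for a finite system condition (iii) is tested by checking that the points reachable from e exhaust N.

```python
def closure(S, e):
    """Smallest set containing e and closed under S (S given as a dict)."""
    seen, x = set(), e
    while x not in seen:
        seen.add(x)
        x = S[x]
    return seen

def peano_report(N, S, e):
    """Check the three conditions for a finite candidate <N, S, e>."""
    return {
        "(i) e not in ran S": e not in S.values(),
        "(ii) S one-to-one": len(set(S.values())) == len(S),
        "(iii) induction": closure(S, e) == set(N),
    }

# Hypothetical finite systems: a loop back to e, and a loop back to S(e).
loop_to_e = {"e": "a", "a": "b", "b": "e"}
loop_to_Se = {"e": "a", "a": "b", "b": "a"}

print(peano_report(loop_to_e.keys(), loop_to_e, "e"))
print(peano_report(loop_to_Se.keys(), loop_to_Se, "e"))
```

The first system fails only condition (i) and the second fails only condition (ii), so each of these two conditions does independent work.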

First we want to show that ω (with the successor operation and 0)
is a Peano system. In particular, this will show that some Peano system
exists. Let σ be the restriction of the successor operation to ω:

    σ = {<n, n⁺> | n ∈ ω}.

Theorem 4D <ω, σ, 0> is a Peano system.

Proof Since ω is inductive, we have 0 ∈ ω and σ: ω → ω. The Peano
induction postulate, as applied to <ω, σ, 0>, states that any subset A of ω
containing 0 and closed under σ equals ω itself. This is just the induction
principle for ω. Clearly 0 ∉ ran σ, since n⁺ ≠ ∅. It remains only to show
that σ is one-to-one. For that purpose (and others) we will use the concept
of a transitive set.


Definition A set A is said to be a transitive set iff every member of a
member of A is itself a member of A:

    x ∈ a ∈ A ⇒ x ∈ A.

This condition can also be stated in any of the following (equivalent)
ways:

    ⋃A ⊆ A,
    a ∈ A ⇒ a ⊆ A,
    A ⊆ 𝒫A.


At this point we have violated a basic rule—we have defined “transitive” 
to mean two different things. In Chapter 3 we said that A is transitive if 
whenever xAy and yAz, then xAz. And now we define A to be a transitive 
set if a very different condition is met. But both usages of the word are 
well established. And so we will learn to live with the ambiguity. In practice, 
the context will make clear which sense of “transitive” is wanted. Further- 
more when the concept of Chapter 3 is meant, we will refer to A as a 
transitive relation (luckily A will be a relation), reserving the phrase 
“transitive set” for the concept defined above.

Example The set {∅, {{∅}}} is not a transitive set. This is because

    {∅} ∈ {{∅}} ∈ {∅, {{∅}}},

but {∅} ∉ {∅, {{∅}}}. Also {0, 1, 5} is not a transitive set, since
4 ∈ 5 ∈ {0, 1, 5} whereas 4 ∉ {0, 1, 5}.


Theorem 4E For a transitive set a,

    ⋃(a⁺) = a.

Proof We proceed to calculate ⋃(a⁺):

    ⋃a⁺ = ⋃(a ∪ {a})
        = ⋃a ∪ ⋃{a}      by Exercise 21 of Chapter 2
        = (⋃a) ∪ a
        = a.

The last step is justified by the fact that ⋃a ⊆ a for a transitive set a. ⊣
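
Both the example and Theorem 4E can be checked on small sets. This sketch assumes, for illustration only, that sets are modeled by nested Python frozensets; the helper names big_union and is_transitive_set are hypothetical. It uses the characterization ⋃A ⊆ A of transitive sets.

```python
def big_union(A):
    """⋃A: the set of members of members of A."""
    return frozenset(x for a in A for x in a)

def is_transitive_set(A):
    """A is a transitive set iff ⋃A ⊆ A."""
    return big_union(A) <= A

def successor(a):
    """a⁺ = a ∪ {a}."""
    return a | frozenset([a])

empty = frozenset()
one = successor(empty)                               # {∅}
example = frozenset([empty, frozenset([one])])       # {∅, {{∅}}}
three = successor(successor(one))

print(is_transitive_set(example))       # False: {∅} is a member of a member only
print(is_transitive_set(three))         # True
print(big_union(successor(three)) == three)          # Theorem 4E for a = 3
```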


Theorem 4F Every natural number is a transitive set.

Proof by induction We form the set of numbers for which the theorem
is true; let T = {n ∈ ω | n is a transitive set}. It suffices to show that T is
inductive, for then T = ω.

Trivially 0 ∈ T. If k ∈ T, then

    ⋃(k⁺) = k        by the preceding theorem
          ⊆ k⁺,

whence k⁺ ∈ T. Thus T is inductive. ⊣


We can now complete the proof of Theorem 4D; it remained for us to
show that the successor operation on ω is one-to-one. If m⁺ = n⁺ for m
and n in ω, then ⋃(m⁺) = ⋃(n⁺). But since m and n are transitive sets, we
have ⋃(m⁺) = m and ⋃(n⁺) = n by Theorem 4E. Hence m = n. ⊣


Theorem 4G The set ω is a transitive set.

This theorem can be stated as: Every natural number is itself a set of
natural numbers. We will later strengthen this to: Every natural number is
the set of all smaller natural numbers.

Proof by induction We want to show that (∀n ∈ ω) n ⊆ ω. Form the set
of n’s for which this holds:

    T = {n ∈ ω | n ⊆ ω}.

We must verify that T is inductive. Clearly 0 ∈ T. If k ∈ T, then we have

    k ⊆ ω   and   {k} ⊆ ω,

whereupon k ∪ {k} ⊆ ω, thus showing that k⁺ ∈ T. And so T is inductive
and therefore coincides with ω. ⊣


Exercises 


2. Show that if a is a transitive set, then a⁺ is also a transitive set. 


3. (a) Show that if a is a transitive set, then 𝒫a is also a transitive set. 
(b) Show that if 𝒫a is a transitive set, then a is also a transitive set. 


4. Show that if a is a transitive set, then ⋃a is also a transitive set. 


5. Assume that every member of 𝒜 is a transitive set. 
(a) Show that ⋃𝒜 is a transitive set. 
(b) Show that ⋂𝒜 is a transitive set (assuming that 𝒜 is nonempty). 


6. Prove the converse to Theorem 4E: If ⋃(a⁺) = a, then a is a transitive 
set. 


RECURSION ON ω 


Consider the following guessing game. Suppose I am thinking of a 
function h: ω → A. Possibly I am reluctant to tell you directly what the 
values of this function are. Instead I reveal (i) what h(0) is, and (ii) a function 
F: A → A such that h(n⁺) = F(h(n)) for all n ∈ ω. This then gives away all 
the information; you can compute successively 

    h(0),
    h(1) = F(h(0)),
    h(2) = F(h(1)),

and so forth. 
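The successive computation just described can be sketched in a few lines. This is an illustration only: Python integers stand in for the members of ω, and the function name values_of_h and the sample choice a = 1, F(x) = 3x + 1 are assumptions made for the example.

```python
def values_of_h(a, F, count):
    """Return [h(0), h(1), ..., h(count - 1)], where h(0) = a and h(n+) = F(h(n))."""
    values = []
    current = a
    for _ in range(count):
        values.append(current)
        current = F(current)  # pass from h(n) to h(n+)
    return values

# Sample data: h(0) = 1 and F(x) = 3x + 1.
sample = values_of_h(1, lambda x: 3 * x + 1, 5)
```

With these sample choices the first five values are 1, 4, 13, 40, 121. The computation shows how to find h if it exists; it does not by itself prove that a set h with these properties exists, which is the harder problem taken up next.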

Now for a harder problem. Suppose we are given a set A, an element 
a ∈ A, and a function F: A → A. How can we show that there exists a function 
h: ω → A such that (i) h(0) = a, and (ii) h(n⁺) = F(h(n)) for each n ∈ ω? 
The preceding paragraph tells how to compute h if it exists. But we now 
want to prove that there exists a set h that is a function meeting the 
above conditions. 


Recursion Theorem on ω Let A be a set, a ∈ A, and F: A → A. Then 
there exists a unique function h: ω → A such that 

    h(0) = a,

and for every n in ω, 

    h(n⁺) = F(h(n)).


Proof³ The idea is to let h be the union of many approximating 
functions. For the purposes of this proof, call a function v acceptable iff 
dom v ⊆ ω, ran v ⊆ A, and the following conditions hold: 

    (i) If 0 ∈ dom v, then v(0) = a.
    (ii) If n⁺ ∈ dom v (where n ∈ ω), then also n ∈ dom v and v(n⁺) = F(v(n)).

Let 𝒦 be the collection of all acceptable functions, and let h = ⋃𝒦. 
Thus 

    (☆)   <n, y> ∈ h   iff   <n, y> is a member of some acceptable v
                       iff   v(n) = y for some acceptable v.


We claim that this h meets the demands of the theorem. This claim can 
be broken down into four parts. The four parts involve showing that 
(1) h is a function, (2) h is acceptable, (3) dom h is all of ω, and (4) h is 
unique. 

1. We first claim that h is a function. (Proving this will, in effect, amount 
to showing that two acceptable functions always agree with each other 
whenever both are defined.) Let S be the set of natural numbers at which 
there is no more than one candidate for h(n): 

    S = {n ∈ ω | for at most one y, <n, y> ∈ h}.

We must check that S is inductive. If <0, y₁> ∈ h and <0, y₂> ∈ h, then by 
(☆) there exist acceptable v₁ and v₂ such that v₁(0) = y₁ and v₂(0) = y₂. 
But by (i) it follows that y₁ = a = y₂. Thus 0 ∈ S. 

Next suppose that k ∈ S; we seek to show that k⁺ ∈ S. Toward that 
end suppose that <k⁺, y₁> ∈ h and <k⁺, y₂> ∈ h. Again there must exist 
acceptable v₁ and v₂ such that v₁(k⁺) = y₁ and v₂(k⁺) = y₂. By condition 
(ii) it follows that 

    y₁ = v₁(k⁺) = F(v₁(k))   and   y₂ = v₂(k⁺) = F(v₂(k)).

But since k ∈ S, we have v₁(k) = v₂(k). (This is because <k, v₁(k)> and 
<k, v₂(k)> are in h.) Therefore 

    y₁ = F(v₁(k)) = F(v₂(k)) = y₂.

This shows that k⁺ ∈ S. Hence S is inductive and coincides with ω. 
Consequently h is a function. 


³ This proof is more involved than ones we have met up to now. In fact, you might 
want to postpone detailed study of it until after seeing some applications of the theorem. 
But it is an important proof, and the ideas in it will be seen again (in Chapters 7 and 9). 
2. Next we claim that h itself is acceptable. We have just seen that h 
is a function, and it is clear from (☆) that dom h ⊆ ω and ran h ⊆ A. 

First examine (i). If 0 ∈ dom h, then there must be some acceptable v 
with v(0) = h(0). Since v(0) = a, we have h(0) = a. 

Next examine (ii). Assume n⁺ ∈ dom h. Again there must be some 
acceptable v with v(n⁺) = h(n⁺). Since v is acceptable we have n ∈ dom v 
(and v(n) = h(n)) and 

    h(n⁺) = v(n⁺) = F(v(n)) = F(h(n)).

Thus h satisfies (ii) and so is acceptable. 

3. We now claim that dom h = ω. It suffices to show that dom h is 
inductive. The function {<0, a>} is acceptable and hence 0 ∈ dom h. Suppose 
that k ∈ dom h; we seek to show that k⁺ ∈ dom h. If this fails, then look at 

    v = h ∪ {<k⁺, F(h(k))>}.

Then v is a function, dom v ⊆ ω, and ran v ⊆ A. We will show that v is 
acceptable. 

Condition (i) holds since v(0) = h(0) = a. For condition (ii) there are two 
cases. If n⁺ ∈ dom v where n⁺ ≠ k⁺, then n⁺ ∈ dom h and v(n⁺) = h(n⁺) = 
F(h(n)) = F(v(n)). The other case occurs if n⁺ = k⁺. Since the successor 
operation is one-to-one, n = k. By assumption k ∈ dom h. Thus 

    v(k⁺) = F(h(k)) = F(v(k))

and (ii) holds. Hence v is acceptable. But then v ⊆ h, so that k⁺ ∈ dom h 
after all. So dom h is inductive and therefore coincides with ω. 


4. Finally we claim that h is unique. For let h₁ and h₂ both satisfy 
the conclusion of the theorem. Let S be the set on which h₁ and h₂ agree: 

    S = {n ∈ ω | h₁(n) = h₂(n)}.

Then S is inductive; we leave the details of this to Exercise 7. Hence 
S = ω and h₁ = h₂. ⊣
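The role of the acceptable functions can be illustrated on finite data. The sketch below is an assumption-laden model, not the proof itself: functions are Python dicts with integer keys standing in for members of ω, the check ran v ⊆ A is omitted, and the names is_acceptable, approximation, and union_of are hypothetical.

```python
def is_acceptable(v, a, F):
    """Check conditions (i) and (ii) for a finite function v given as a dict."""
    if any(not (isinstance(key, int) and key >= 0) for key in v):
        return False                      # dom v must consist of natural numbers
    if 0 in v and v[0] != a:
        return False                      # condition (i)
    for key in v:
        if key >= 1:
            n = key - 1                   # key is n+
            if n not in v or v[key] != F(v[n]):
                return False              # condition (ii)
    return True

def approximation(a, F, size):
    """The acceptable function whose domain is {0, 1, ..., size - 1}."""
    v = {}
    for n in range(size):
        v[n] = a if n == 0 else F(v[n - 1])
    return v

def union_of(functions):
    """Union of functions viewed as sets of pairs; raise if the union is not a function."""
    h = {}
    for v in functions:
        for n, y in v.items():
            if n in h and h[n] != y:
                raise ValueError("two approximations disagree at %r" % n)
            h[n] = y
    return h

F = lambda x: 3 * x + 1
approximations = [approximation(1, F, size) for size in range(6)]
h_part = union_of(approximations)
```

Every member of approximations passes is_acceptable, and their union is again a function (no ValueError is raised), mirroring part 1 of the proof. Only finitely many approximations are built here, so h_part is merely the restriction of h to {0, ..., 4}.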
Examples Let Z be the set of all integers, positive, negative, and zero: 

    Z = {..., −1, 0, 1, 2, ...}.

There is no function h: Z → Z such that for every a ∈ Z, 

    h(a + 1) = h(a)² + 1.

(For notice that h(a) > h(a − 1) > h(a − 2) > ··· > 0.) Recursion on ω 
relies on there being a starting point 0. Z has no analogous starting point. 
For another example, let 

    F(a) = a + 1   if a < 0,
    F(a) = a       if a ≥ 0.

Then there are infinitely many functions h: Z → Z such that h(0) = 0 and 
for every a in Z, h(a + 1) = F(h(a)). The graph of one such function is 
shown in Fig. 17. 
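One way to see that there are infinitely many such functions is to exhibit a family of them. The sketch below assumes the family h_c(a) = min(0, a + c) for natural numbers c; this family is offered as an illustration and is not claimed to be the function drawn in Fig. 17. The check runs over a finite window of integers only.

```python
def F(a):
    """F(a) = a + 1 if a < 0, and F(a) = a if a >= 0."""
    return a + 1 if a < 0 else a

def make_h(c):
    """Return the candidate h_c with h_c(a) = min(0, a + c); c is a natural number."""
    return lambda a: min(0, a + c)

def satisfies_conditions(h, window):
    """Check h(0) = 0 and h(a + 1) = F(h(a)) for every a in the given finite window."""
    return h(0) == 0 and all(h(a + 1) == F(h(a)) for a in window)

all_ok = all(satisfies_conditions(make_h(c), range(-50, 50)) for c in range(10))
```

Distinct values of c give distinct functions (for instance h_1(−5) = −4 while h_2(−5) = −3), so the two conditions do not pin down h on Z.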


Fig. 17. h(a + 1) is h(a) + 1 when h(a) < 0 and is h(a) when h(a) ≥ 0. 


Digression There is a classic erroneous proof of the recursion theorem 
that people have sometimes tried (even in print!) to give. The error is 
easier to analyze if we apply it not to ω but instead to an arbitrary Peano 
system <N, S, e>. Given any a in A and any function F: A → A, there is a 
unique function h: N → A such that h(e) = a and h(S(x)) = F(h(x)) for each 
x in N. The erroneous proof of this statement runs as follows: 

“We apply the Peano induction postulate to dom h. First of all, we are 
told that h(e) = a, and so h is defined at e, i.e., e ∈ dom h. And whenever 
x ∈ dom h then immediately h(S(x)) = F(h(x)), so S(x) ∈ dom h as well. 
Hence dom h is closed under S. It follows (by induction) that dom h = N, 
i.e., h is defined throughout N.” 

What is wrong? Well, for one thing, the proof talks about the function 
h before any such function is known to exist. One might think that a 
little rewording would get around this objection. But no, a closer examin- 
ation of the proof shows that it does not utilize conditions (i) and (ii) in 
the definition of Peano systems. The recursion theorem is in general false 
for systems not meeting those conditions, such as the systems of Fig. 16a 
or Fig. 16b. So any correct proof of recursion absolutely must make use 
of conditions (i) and (ii), as well as using induction. (Our proof of 
recursion on ω uses these conditions in part 3.) 


Our first application of the recursion theorem will be to show that any 
Peano system is “just like” <ω, σ, 0>. There are other Peano systems; for 
example, let N be the set {1, 2, 4, 8, ...} of powers of 2, let S(n) = 2n, 
and let e = 1. Then <N, S, e> is a Peano system. The following theorem 
expresses the structural similarity between this Peano system and <ω, σ, 0>. 


Theorem 4H Let <N, S, e> be a Peano system. Then <ω, σ, 0> is 
isomorphic to <N, S, e>, i.e., there is a function h mapping ω one-to-one 
onto N in a way that preserves the successor operation 

    h(σ(n)) = S(h(n))

and the zero element 

    h(0) = e.


Remark The equation h(σ(n)) = S(h(n)) (together with h(0) = e) implies 
that h(1) = S(e), h(2) = S(S(e)), h(3) = S(S(S(e))), etc. Thus the situation 
must be as shown in Fig. 18. 


Fig. 18. Isomorphism of Peano systems: h carries 0, 1, 2, 3, ... to 
e, S(e), S(S(e)), S(S(S(e))), ... 


Proof By the recursion theorem there is a unique function h: ω → N 
such that h(0) = e and for all n ∈ ω, h(n⁺) = S(h(n)). It remains to show 
that h is one-to-one and that ran h = N. 

To show that ran h = N we use the Peano induction postulate for 
<N, S, e>. Clearly e ∈ ran h. Also for any x ∈ ran h (say x = h(n)) we have 
S(x) ∈ ran h (since S(x) = h(n⁺)). Hence by the Peano induction postulate 
applied to ran h, we have ran h = N. 

To show that h is one-to-one we use induction in ω. Let 

    T = {n ∈ ω | for every m in ω different from n, h(m) ≠ h(n)}.

First we claim that 0 ∈ T. Any m ∈ ω different from 0 must be of the form 
p⁺ (by Theorem 4C). And h(p⁺) = S(h(p)) ≠ e since e ∉ ran S. Hence 
h(0) = e ≠ h(p⁺), and consequently 0 ∈ T. 

Now assume that k ∈ T and consider k⁺. Suppose that h(k⁺) = h(m). 
Then m ≠ 0 by the preceding paragraph, so m = p⁺ for some p. Thus 

    S(h(k)) = h(k⁺) = h(p⁺) = S(h(p)).

Since S is one-to-one, this leaves the equation h(k) = h(p). Since k ∈ T, 
we have k = p. Hence k⁺ = p⁺ = m. This shows that k⁺ ∈ T. So T is 
inductive, and thus coincides with ω. Consequently h is one-to-one. ⊣
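For the Peano system of powers of 2 mentioned earlier (e = 1 and S(n) = 2n), the isomorphism h can be computed directly from its recursive description. This is a small sketch in which Python integers stand in for ω; the checks cover only an initial segment and so illustrate the theorem rather than prove it.

```python
def S(x):
    """Successor operation of the Peano system of powers of 2."""
    return 2 * x

e = 1  # the zero element of that system

def h(n):
    """h(0) = e and h(n+) = S(h(n)), computed by iterating S n times."""
    value = e
    for _ in range(n):
        value = S(value)
    return value

first_values = [h(n) for n in range(5)]
preserves_successor = all(h(n + 1) == S(h(n)) for n in range(100))
one_to_one_on_segment = len({h(n) for n in range(100)}) == 100
```

The first values are 1, 2, 4, 8, 16, so h(n) is the n-th power of 2, matching the picture in Fig. 18.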
Theorems 4D and 4H relate the constructive approach to the natural 
numbers and the axiomatic approach. Theorem 4D shows that Peano’s 
postulates are true of the number system we have constructed. And Theorem 
4H shows that the number system we have constructed is, “to within 
isomorphism,” the only system satisfying Peano’s postulates. 


Exercises 


7. Complete part 4 of the proof of the recursion theorem on ω. 


8. Let f be a one-to-one function from A into A, and assume that 
c ∈ A − ran f. Define h: ω → A by recursion: 

    h(0) = c,
    h(n⁺) = f(h(n)).

Show that h is one-to-one. 


9. Let f be a function from B into B, and assume that A ⊆ B. We have 
two possible methods for constructing the “closure” C of A under f. First 
define C* to be the intersection of the closed supersets of A: 

    C* = ⋂{X | A ⊆ X ⊆ B & f⟦X⟧ ⊆ X}.

Alternatively, we could apply the recursion theorem to obtain the function 
h for which 

    h(0) = A,
    h(n⁺) = h(n) ∪ f⟦h(n)⟧.

Clearly h(0) ⊆ h(1) ⊆ ···; define C_* to be ⋃ ran h; in other words 

    C_* = ⋃{h(i) | i ∈ ω}.

Show that C* = C_*. [Suggestion: To show that C* ⊆ C_*, show that 
f⟦C_*⟧ ⊆ C_*. To show that C_* ⊆ C*, use induction to show that h(n) ⊆ C*.] 
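The two constructions in Exercise 9 can be compared on a small finite example before attempting the proof. The sketch below assumes B is a finite set, so that C* can be found by brute force over all subsets of B and the chain h(0) ⊆ h(1) ⊆ ··· stabilizes after finitely many steps; the function names and the sample data are hypothetical.

```python
from itertools import combinations

def image(f, X):
    """Return f[[X]], the image of the set X under f."""
    return frozenset(f(x) for x in X)

def closure_from_above(A, B, f):
    """C*: the intersection of all X with A ⊆ X ⊆ B and f[[X]] ⊆ X (brute force)."""
    result = frozenset(B)
    elements = list(B)
    for size in range(len(elements) + 1):
        for combo in combinations(elements, size):
            X = frozenset(combo)
            if A <= X and image(f, X) <= X:
                result &= X
    return result

def closure_from_below(A, f):
    """C_*: the union of the chain h(0) = A, h(n+) = h(n) ∪ f[[h(n)]]."""
    current = frozenset(A)
    while True:
        following = current | image(f, current)
        if following == current:   # the chain has stabilized
            return current
        current = following

B = frozenset(range(8))
A = frozenset({1})
f = lambda x: (x + 2) % 8
from_above = closure_from_above(A, B, f)
from_below = closure_from_below(A, f)
```

For this sample data both constructions give {1, 3, 5, 7}. Agreement on one example is of course no substitute for the argument the exercise asks for.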


10. In Exercise 9, assume that B is the set of real numbers, f(x) = x², and 
A is the closed interval [½, 1]. What is the set called C* and C_*? 


11. In Exercise 9, assume that B is the set of real numbers, f(x) = x − 1, 
and A = {0}. What is the set called C* and C_*? 


12. Formulate an analogue to Exercise 9 for a function f: B × B → B. 
ARITHMETIC* 


We can apply the recursion theorem to define addition and multiplication 
on ω. (Another way of obtaining these operations will be discussed in 
Chapter 6.) For example, suppose we want a function A₅: ω → ω such that 
A₅(n) is the result of adding 5 to n. Then A₅ must satisfy the conditions 

    A₅(0) = 5,
    A₅(n⁺) = A₅(n)⁺   for n in ω.

The recursion theorem assures us that a unique such function exists. 


In general, for each m ∈ ω there exists (by the recursion theorem) a 
unique function Aₘ: ω → ω for which 

    Aₘ(0) = m,
    Aₘ(n⁺) = Aₘ(n)⁺   for n in ω.

But we want one binary operation +, not all these little one-place functions. 

Definition A binary operation on a set A is a function from A × A into A. 


Definition Addition (+) is the binary operation on ω such that for any 
m and n in ω, 

    m + n = Aₘ(n).

Thus when written as a relation, 

    + = {<<m, n>, p> | m ∈ ω & n ∈ ω & p = Aₘ(n)}.

In conformity to everyday notation, we write m + n instead of +(m, n) or 
+(<m, n>). 
Theorem 4I For natural numbers m and n, 

    (A1)  m + 0 = m,
    (A2)  m + n⁺ = (m + n)⁺.

This theorem is an immediate consequence of the construction of Aₘ. 
Observe that (A1) and (A2) serve to characterize the binary operation + in 
a recursive fashion. Our only reason for using the Aₘ's is that the recursion 
theorem applies directly to functions with domain ω, not domain ω × ω. 
We can now forget the Aₘ's, and use + and Theorem 4I instead. 

We can now proceed to construct the multiplication operation in much 
the same way. We first apply the recursion theorem to obtain many functions 
Mₘ: ω → ω where Mₘ(n) is the result of multiplying m by n. Specifically, 
for each m ∈ ω there exists (by the recursion theorem) a unique function 
Mₘ: ω → ω for which 

    Mₘ(0) = 0,
    Mₘ(n⁺) = Mₘ(n) + m.


* Readers planning to omit Chapter 5 are permitted to skip this section also. 


Definition Multiplication (·) is the binary operation on ω such that for 
any m and n in ω, 

    m·n = Mₘ(n).
The theorem analogous to Theorem 4I is the following. 


Theorem 4J For natural numbers m and n, 

    (M1)  m·0 = 0,
    (M2)  m·n⁺ = m·n + m.

We can now discard the Mₘ functions, and use · and Theorem 4J 
instead. 
We could, in the same manner, define the exponentiation operation on ω. 
The equations that characterize exponentiation are 

    (E1)  m⁰ = 1,
    (E2)  m^(n⁺) = mⁿ·m.
Example 2 + 2 = 4 (we would be alarmed if this failed), as the following 
calculation demonstrates: 

    2 + 0 = 2               by (A1),
    2 + 1 = 2 + 0⁺
          = (2 + 0)⁺        by (A2)
          = 2⁺
          = 3,
    2 + 2 = 2 + 1⁺
          = (2 + 1)⁺        by (A2)
          = 3⁺
          = 4.
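The pattern of obtaining each operation from a family of one-place functions can be sketched as follows. This is an illustration under stated assumptions: Python integers stand in for the members of ω, the helper recursion plays the role of the recursion theorem, and the names recursion, succ, add, mul, and power are hypothetical.

```python
def recursion(a, F):
    """Return the function h with h(0) = a and h(n+) = F(h(n))."""
    def h(n):
        value = a
        for _ in range(n):
            value = F(value)
        return value
    return h

def succ(n):
    """The successor operation on the stand-in natural numbers."""
    return n + 1

def add(m, n):
    """m + n = A_m(n), where A_m(0) = m and A_m(n+) = A_m(n)+."""
    return recursion(m, succ)(n)

def mul(m, n):
    """m·n = M_m(n), where M_m(0) = 0 and M_m(n+) = M_m(n) + m."""
    return recursion(0, lambda x: add(x, m))(n)

def power(m, n):
    """Exponentiation from (E1) and (E2): start at 1, multiply by m at each step."""
    return recursion(1, lambda x: mul(x, m))(n)
```

With these definitions add(2, 2) evaluates to 4, in agreement with the calculation above, and power(m, 0) is 1 for every m, including m = 0, as (E1) requires.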


Having now given set-theoretic definitions of the operations of arithmetic, 
we next verify that some of the common laws of arithmetic are provable 
within set theory. This verification is additional evidence, albeit at an 
elementary level, that mathematics can be embedded in set theory. 

Theorem 4K The following identities hold for all natural numbers. 
(1) Associative law for addition 
    m + (n + p) = (m + n) + p.
(2) Commutative law for addition 
    m + n = n + m.
(3) Distributive law 
    m·(n + p) = m·n + m·p.
(4) Associative law for multiplication 
    m·(n·p) = (m·n)·p.
(5) Commutative law for multiplication 
    m·n = n·m.
Proof. Each part is proved by induction. This exemplifies a general fact: 


When a function has been constructed by use of the recursion theorem, 
then general properties of the function must usually be proved by induction. 


(1) The proof uses induction on p. That is, consider fixed natural 
numbers m and n, and define A = {p ∈ ω | m + (n + p) = (m + n) + p}. We 
leave the verification that A is inductive for Exercise 15. 

(2) It is necessary to prove two preliminary facts, each of which is 
proved by induction. 

The first preliminary fact is that 0 + n = n for all n ∈ ω. Let A = 
{n ∈ ω | 0 + n = n}. Then 0 ∈ A by (A1). Suppose that k ∈ A. Then 

    0 + k⁺ = (0 + k)⁺     by (A2)
           = k⁺           since k ∈ A,

and hence k⁺ ∈ A. So A is inductive. 

The second preliminary fact is that m⁺ + n = (m + n)⁺ for m and n in ω. 
Consider any fixed m ∈ ω and let B = {n ∈ ω | m⁺ + n = (m + n)⁺}. Again 
0 ∈ B by (A1). Suppose that k ∈ B. Then 

    m⁺ + k⁺ = (m⁺ + k)⁺     by (A2)
            = (m + k)⁺⁺     since k ∈ B
            = (m + k⁺)⁺     by (A2),

showing that k⁺ ∈ B. Hence B is inductive. 
Finally we are ready to prove the commutative law. Consider any 
n ∈ ω and let C = {m ∈ ω | m + n = n + m}. By the first preliminary fact, 
0 + n = n = n + 0, whence 0 ∈ C. Suppose that k ∈ C. Then 

    k⁺ + n = (k + n)⁺     by second fact
           = (n + k)⁺     since k ∈ C
           = n + k⁺       by (A2),

so that k⁺ ∈ C. Hence C is inductive. 
(3) Consider fixed m and n in ω and let 

    A = {p ∈ ω | m·(n + p) = m·n + m·p}.

To check that 0 ∈ A, observe that 

    m·(n + 0) = m·n            by (A1)
              = m·n + 0        by (A1)
              = m·n + m·0      by (M1).

Now suppose that k ∈ A. Then 

    m·(n + k⁺) = m·(n + k)⁺          by (A2)
               = m·(n + k) + m       by (M2)
               = (m·n + m·k) + m     since k ∈ A
               = m·n + (m·k + m)     by part (1)
               = m·n + m·k⁺          by (M2),

which shows that k⁺ ∈ A. Hence A is inductive. 
The reader has no doubt observed that each inductive argument here 
is quite straightforward. And each is, for that matter, much like the next. 
(4) Consider fixed m and n in ω and let 

    A = {p ∈ ω | m·(n·p) = (m·n)·p}.

To check that 0 ∈ A we note that m·(n·0) = m·0 = 0 by (M1), and 
(m·n)·0 = 0 as well. Now suppose that k ∈ A. Then 

    m·(n·k⁺) = m·(n·k + n)         by (M2)
             = m·(n·k) + m·n       by part (3)
             = (m·n)·k + m·n       since k ∈ A
             = (m·n)·k⁺            by (M2),

which shows that k⁺ ∈ A. Hence A is inductive. 
(5) The proof here follows the outline of part (2). There are the 
analogous two preliminary facts to be proved: 0·n = 0 and m⁺·n = 
m·n + n. The details of the three inductive arguments are left for Exercise 
16. ⊣
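The five identities can be spot-checked mechanically from the recursion equations alone. The sketch below assumes Python integers as stand-ins for natural numbers and uses n − 1 and + 1 only to pass between n⁺ and n; the function names are hypothetical. A finite check of this kind is evidence, not a substitute for the inductive proofs.

```python
def add(m, n):
    """(A1) m + 0 = m; (A2) m + n+ = (m + n)+."""
    return m if n == 0 else add(m, n - 1) + 1

def mul(m, n):
    """(M1) m·0 = 0; (M2) m·n+ = m·n + m."""
    return 0 if n == 0 else add(mul(m, n - 1), m)

def laws_hold(bound):
    """Check the five identities of Theorem 4K for all m, n, p below bound."""
    for m in range(bound):
        for n in range(bound):
            if add(m, n) != add(n, m):          # commutative law for addition
                return False
            if mul(m, n) != mul(n, m):          # commutative law for multiplication
                return False
            for p in range(bound):
                if add(m, add(n, p)) != add(add(m, n), p):          # associativity of +
                    return False
                if mul(m, add(n, p)) != add(mul(m, n), mul(m, p)):  # distributive law
                    return False
                if mul(m, mul(n, p)) != mul(mul(m, n), p):          # associativity of ·
                    return False
    return True
```

Calling laws_hold(6) checks every triple of numbers below 6 and reports whether any identity fails.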
Exercises 

13. Let m and n be natural numbers such that m·n = 0. Show that either 
m = 0 or n = 0. 

14. Call a natural number even if it has the form 2·m for some m. Call 
it odd if it has the form (2·p) + 1 for some p. Show that each natural 
number is either even or odd, but never both. 


15. Complete the proof of part (1) of Theorem 4K. 
16. Complete the proof of part (5) of Theorem 4K. 
17. Prove that m^(n+p) = mⁿ·mᵖ. 


ORDERING ON ω 


We have defined natural numbers in such a way that, for example, 4 ∈ 7. 
This may have appeared to be a spurious side effect of our definition, but we 
now want to turn it to our advantage. We have the following strikingly 
simple definition of order on ω: For natural numbers m and n, define m to be 
less than n iff m ∈ n. We could introduce a special symbol “<” for this: 

    m < n   iff   m ∈ n.

But the special symbol seems unnecessary; we can just use “∈”. It will, 
however, be necessary to keep in mind the dual role of this symbol, which 
denotes both membership and ordering. In place of a ≤ symbol, we define 

    m ∈̲ n   iff   either m ∈ n or m = n.

Observe that 

    p ∈ k⁺   ⇔   p ∈̲ k,

a fact we will use in later calculations. 
We are now entitled to state the following fact: Any natural number is 
just the set of all smaller natural numbers. That is, for any n in ω, 

    x is a member of n   ⇔   x ∈ ω & x is less than n.

To verify this, note that we can restate it as 

    x ∈ n   ⇔   x ∈ ω & x ∈ n,

which is true because ω is a transitive set, and thus x ∈ n ∈ ω ⇒ x ∈ ω. 

We should show that we do indeed have a linear ordering relation on 
ω, in the sense defined in Chapter 3. The relation in question is the set of 
ordered pairs ∈_ω defined by 

    ∈_ω = {<m, n> ∈ ω × ω | m ∈ n}.

We will prove that this is a linear ordering relation on ω, i.e., that it is a 
transitive relation that satisfies trichotomy on ω. 
Because each natural number is a transitive set, we have for m, n, p in ω: 

    m ∈ n & n ∈ p   ⇒   m ∈ p.

That is, our ordering relation on ω is a transitive relation. It is somewhat 
harder to show from our definitions that of any two distinct natural 
numbers, one is larger than the other. For that result, we will need the 
following lemma. 


Lemma 4L (a) For any natural numbers m and n, 

    m ∈ n   iff   m⁺ ∈ n⁺.

(b) No natural number is a member of itself. 


Proof (a) First assume that m⁺ ∈ n⁺. Then we have m ∈ m⁺ ∈̲ n. 
Hence (by the transitivity of n) we obtain m ∈ n. 
To prove the converse we use induction on n. That is, form 

    T = {n ∈ ω | (∀m ∈ n) m⁺ ∈ n⁺}.

Then 0 ∈ T vacuously. Consider any k ∈ T. In order to show that k⁺ ∈ T, 
we must show that whenever m ∈ k⁺, then m⁺ ∈ k⁺⁺. Given m ∈ k⁺, we 
have either m = k (in which case m⁺ = k⁺ ∈ k⁺⁺) or m ∈ k. In the latter case 
(since k ∈ T), m⁺ ∈ k⁺ ⊆ k⁺⁺. So in either case we get m⁺ ∈ k⁺⁺ and thus 
k⁺ ∈ T. Hence T is inductive and coincides with ω. 

Part (b) follows easily from (a). Let 

    T = {n ∈ ω | n ∉ n}.

Then 0 ∈ T since nothing is a member of 0. And by part (a), 
k ∉ k ⇒ k⁺ ∉ k⁺. Hence T is inductive and coincides with ω. ⊣


(In Chapter 7 we will come to the regularity axiom, which implies 
among other things that no set is a member of itself. But for natural 
numbers we can get along without the regularity axiom.) 


We next use the lemma to prove that for two distinct natural numbers, 
one is always a member of the other. (It is the smaller one that is a member 
of the larger one.) 


Trichotomy Law for ω For any natural numbers m and n, exactly one 
of the three conditions 

    m ∈ n,   m = n,   n ∈ m

holds. 




Proof First note that at most one can hold. If m ∈ n and m = n, then 
m ∈ m, in violation of Lemma 4L(b). Also if m ∈ n ∈ m, then because m is a 
transitive set we again have m ∈ m. 

It remains to show that at least one holds. For that we use induction; let 

    T = {n ∈ ω | (∀m ∈ ω)(m ∈ n or m = n or n ∈ m)}.

In order to show that 0 ∈ T, we want to show that 0 ∈̲ m for all m. This we 
do by induction on m. (An induction within an induction!) Clearly 
0 ∈̲ 0, and if 0 ∈̲ k, then 0 ∈ k⁺. Hence 0 ∈ T. 

Now assume that k ∈ T and consider k⁺. For any m in ω we have (since 
k ∈ T) either m ∈̲ k (in which case m ∈ k⁺) or k ∈ m. In the latter case 
k⁺ ∈ m⁺ by Lemma 4L(a), and so k⁺ ∈̲ m. Thus in every case, either m ∈ k⁺ 
or k⁺ = m or k⁺ ∈ m. And so k⁺ ∈ T, T is inductive, and we are done. ⊣


A set A is said to be a proper subset of B (A ⊂ B) iff it is a subset of B 
that is unequal to B. 

    A ⊂ B   ⇔   A ⊆ B & A ≠ B.

Ordering on ω is given not only by the membership relation, but also by 
the proper subset relation: 


Corollary 4M For any natural numbers m and n, 

    m ∈ n   iff   m ⊂ n

and 

    m ∈̲ n   iff   m ⊆ n.

Proof Since n is a transitive set, 

    m ∈ n   ⇒   m ⊆ n,

and the inclusion is proper by Lemma 4L(b). Conversely assume that 
m ⊂ n. Then m ≠ n, and n ∉ m lest n ∈ n. So by trichotomy m ∈ n and we are 
done. ⊣
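Both descriptions of the ordering can be observed on the first few natural numbers built as sets. This is a minimal sketch assuming Python frozensets as a model of hereditarily finite sets; natural, exactly_one, and the variable names are illustrative. Python's `<` on frozensets is the proper subset relation.

```python
def natural(n):
    """Build the natural number n as a set: 0 = ∅ and k+ = k ∪ {k}."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset([a])
    return a

numbers = [natural(i) for i in range(8)]

def exactly_one(*conditions):
    """True iff exactly one of the given conditions holds."""
    return sum(bool(c) for c in conditions) == 1

# Trichotomy: exactly one of m ∈ n, m = n, n ∈ m.
trichotomy_ok = all(
    exactly_one(m in n, m == n, n in m) for m in numbers for n in numbers
)

# Corollary 4M: m ∈ n iff m is a proper subset of n.
membership_is_proper_subset = all(
    (m in n) == (m < n) for m in numbers for n in numbers
)
```

In particular numbers[4] in numbers[7] holds, which is the statement 4 ∈ 7 from the start of this section.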


The above proof uses trichotomy in a typical way: To show that m ∈ n, 
it suffices to eliminate the other two alternatives. 

The following theorem gives the order-preserving properties of addition 
and multiplication. The theorem will be used in Chapter 5 (but not elsewhere). 


Theorem 4N For any natural numbers m, n, and p, 

    m ∈ n   ⇔   m + p ∈ n + p.

If, in addition, p ≠ 0, then 

    m ∈ n   ⇔   m·p ∈ n·p.
Proof First consider addition. For the “⇒” half we use induction on p. 
Consider fixed m ∈ n ∈ ω and let 

    A = {p ∈ ω | m + p ∈ n + p}.

Clearly 0 ∈ A, and 

    k ∈ A   ⇒   m + k ∈ n + k
            ⇒   (m + k)⁺ ∈ (n + k)⁺     by Lemma 4L(a)
            ⇒   m + k⁺ ∈ n + k⁺         by (A2)
            ⇒   k⁺ ∈ A.

Hence A is inductive and so A = ω. 

For the “⇐” half we use the trichotomy law and the “⇒” half. If 
m + p ∈ n + p, then we cannot have m = n (lest n + p ∈ n + p) nor n ∈ m 
(lest n + p ∈ m + p ∈ n + p). The only alternative is m ∈ n. 

For multiplication the procedure is similar. For the “⇒” direction, 
consider fixed m ∈ n ∈ ω and let 

    B = {q ∈ ω | m·q⁺ ∈ n·q⁺}.

(Recall that for a natural number p ≠ 0 there is some q ∈ ω with q⁺ = p.) 
It is easy to see that 0 ∈ B, since m·0⁺ = m·0 + m = m. Suppose that 
k ∈ B; we need to show that m·k⁺⁺ ∈ n·k⁺⁺. Thus 

    m·k⁺⁺ = m·k⁺ + m
          ∈ n·k⁺ + m

by applying the first part of the theorem to the fact that m·k⁺ ∈ n·k⁺. 
And by again applying the first part of the theorem (this time to the fact that 
m ∈ n), 

    n·k⁺ + m ∈ n·k⁺ + n
             = n·k⁺⁺.

Hence k⁺ ∈ B, B is inductive, and B = ω. 
The “⇐” half then follows exactly as for addition. ⊣


Corollary 4P The following cancellation laws hold for m, n, and p in ω: 

    m + p = n + p   ⇒   m = n,
    m·p = n·p & p ≠ 0   ⇒   m = n.

Proof Apply trichotomy and the preceding theorem. ⊣


Well Ordering of ω Let A be a nonempty subset of ω. Then there is 
some m ∈ A such that m ∈̲ n for all n ∈ A. 
Note Such an m is said to be least in A. Thus the theorem asserts that 
any nonempty subset of ω has a least element. The least element is always 
unique, for if m₁ and m₂ are both least in A, then m₁ ∈̲ m₂ and m₂ ∈̲ m₁. 
Consequently m₁ = m₂. 


Proof Assume that A is a subset of ω without a least element; we will 
show that A = ∅. We could attempt to do this by showing that the complement 
ω − A was inductive. But in order to show that k⁺ ∈ ω − A, it is not 
enough to know merely that k ∈ ω − A; we must know that all numbers 
smaller than k are in ω − A as well. Given this additional information, we 
can argue that k⁺ ∈ ω − A lest it be a least element for A. 

To write down what is approximately this argument, let 

    B = {m ∈ ω | no number less than m belongs to A}.

We claim that B is inductive. 0 ∈ B vacuously. Suppose that k ∈ B. Then if 
n is less than k⁺, either n is less than k (in which case n ∉ A since k ∈ B) or 
n = k (in which case n ∉ A lest, by trichotomy, it be least in A). In either case, 
n is outside of A. Hence k⁺ ∈ B and so B is inductive. It clearly follows that 
A = ∅; for example, 7 ∉ A because 8 ∈ B. ⊣
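For a finite nonempty set of natural numbers, the least element in the sense of the theorem can be found by a direct search using only membership and equality. The sketch assumes frozensets as a model of the natural numbers; natural and least_element are hypothetical names, and the search says nothing about infinite subsets of ω.

```python
def natural(n):
    """Build the natural number n as a set: 0 = ∅ and k+ = k ∪ {k}."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset([a])
    return a

def least_element(A):
    """Return the m in A such that m ∈ n or m = n for every n in A."""
    for m in A:
        if all(m in n or m == n for n in A):
            return m
    raise ValueError("no least element found; A must be nonempty")

sample = {natural(5), natural(2), natural(6)}
least = least_element(sample)
```

For the sample set {5, 2, 6} the search returns the set representing 2, and by the note above no other member of the set could qualify.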


Corollary 4Q There is no function f: ω → ω such that f(n⁺) ∈ f(n) for 
every natural number n. 


Proof The range of f would be a nonempty subset of ω without a least 
element, contradicting the well ordering of ω. ⊣


Our proof of the well ordering of ω suggests that it might be useful to 
have a second induction principle. 


Strong Induction Principle for ω Let A be a subset of ω, and assume 
that for every n in ω, 

    if every number less than n is in A, then n ∈ A.

Then A = ω. 


Proof Suppose, to the contrary, that A ≠ ω. Then ω − A ≠ ∅, and by 
the well ordering it has a least number m. Since m is least in ω − A, all 
numbers less than m are in A. But then by the hypothesis of the theorem 
m ∈ A, contradicting the fact that m ∈ ω − A. ⊣


The well-ordering principle provides an alternative to proofs by 
induction. Suppose we want to show that for every natural number, a certain 
statement holds. Instead of forming the set of numbers for which the 
statement is true, consider the set of numbers for which it is false, i.e., the 
set C of counterexamples. To show that C = Ø, it suffices to show that C 
cannot have a least element. 
Exercises 
18. Simplify: ∈_ω⁻¹⟦{7, 8}⟧. 


19. Prove that if m is a natural number and d is a nonzero number, then 
there exist numbers q and r such that m = (d·q) + r and r is less than d. 


20. Let A be a nonempty subset of ω such that ⋃A = A. Show that A = ω. 
21. Show that no natural number is a subset of any of its elements. 
22. Show that for any natural numbers m and p we have m ∈ m + p⁺. 


23. Assume that m and n are natural numbers with m less than n. Show that 
there is some p in ω for which m + p⁺ = n. (It follows from this and 
the preceding exercise that m is less than n iff (∃p ∈ ω) m + p⁺ = n.) 


24. Assume that m + n = p + q. Show that 

    m ∈ p   ⇔   q ∈ n.

25. Assume that n ∈ m and q ∈ p. Show that 

    (m·q) + (n·p) ∈ (m·p) + (n·q).

[Suggestion: Use Exercise 23.] 


26. Assume that n ∈ ω and f: n⁺ → ω. Show that ran f has a largest 
element. 


27. Assume that A is a set, G is a function, and f₁ and f₂ map ω into A. 
Further assume that for each n in ω both f₁ ↾ n and f₂ ↾ n belong to dom G 
and 

    f₁(n) = G(f₁ ↾ n)   &   f₂(n) = G(f₂ ↾ n).

Show that f₁ = f₂. 


28. Rewrite the proof of Theorem 4G using, in place of induction, the well 
ordering of ω. 


Review Exercises 


29. Write an expression for the set 4 using only symbols ∅, {, }, and 
commas. 


30. What is ⋃4? What is ⋂4? 
31. What is ⋃⋃7? 


32. (a) Let A = {1}. Calculate A⁺ and ⋃(A⁺). 
(b) What is ⋃({2}⁺)? 
33. Which of the following sets are transitive? (For each set S that is not 
transitive, specify a member of ⋃S not belonging to S.) 
(a) {0, 1, {1}}. 
(b) {1}. 
(c) <0, 1>. 
34. Find suitable a, b, etc. making each of the following sets transitive. 
(a) {{{∅}}, a, b}. 
(b) {{{{∅}}}, c, d, e}. 
35. Let S be the set <1, 0>. 
(a) Find a transitive set T₁ for which S ⊆ T₁. 
(b) Find a transitive set T₂ for which S ∈ T₂. 


36. There is a function h: ω → ω for which h(0) = 3 and h(n⁺) = 2·h(n). 
What is h(4)? 
37. We will say that a set S has n elements (where n ∈ ω) iff there is a one-to- 
one function from n onto S. Assume that A has m elements and B has n 
elements. 
(a) Show that if A and B are disjoint, then A ∪ B has m + n elements. 
(b) Show that A × B has m·n elements. 


38. Assume that h is the function from ω into ω for which h(0) = 1 and 
h(n⁺) = h(n) + 3. Give an explicit (not recursive) expression for h(n). 
39. Assume that h is the function from ω into ω for which h(0) = 1 and 
h(n⁺) = h(n) + (2·n) + 1. Give an explicit (not recursive) expression for h(n). 
40. Assume that h is the function from ω into ω defined by 
h(n) = 5·n + 2. Express h(n⁺) in terms of h(n) as simply as possible. 


CHAPTER 5 


CONSTRUCTION OF 
THE REAL NUMBERS¹ 


In Chapter 4 we gave a set-theoretic construction of the set ω of 
natural numbers. In the present chapter we will continue to show how 
mathematics can be embedded in set theory, by giving a set-theoretic 
construction of the real numbers. (The operative phrase is “can be,” not 
“is” or “must be.” We will return to this point in the section on “Two.”) 


INTEGERS 


First we want to extend our set ω of natural numbers to a set Z of integers 
(both positive and negative). Here “extend” is to be loosely interpreted, 
since ω will not actually be a subset of Z. But Z will include an 
“isomorphic copy” of ω (Fig. 19). 

A negative integer can be named by using two natural numbers and a 
subtraction symbol: 2 — 3, 5 — 10, etc. We need some sets to stand behind 
these names. 

As a first guess, we could try taking the integer −1 to be the pair <2, 3> 
of natural numbers used to name −1 in the preceding paragraph. And 
similarly we could try letting the integer −5 be the pair <5, 10> of natural 
numbers. But this first guess fails, because −1 has a multiplicity of 
names: 2 − 3 = 0 − 1 but <2, 3> ≠ <0, 1>. 


¹ Other chapters do not depend on Chapter 5. 

As a second guess, we can define an equivalence relation ~ such that 
<2, 3> ~ <0, 1>. (Imposing such an equivalence relation is sometimes 
described as “identifying” <2, 3> and <0, 1>.) Then we will have the one 
equivalence class 

    [<2, 3>] = [<0, 1>],

and we can take −1 to be this equivalence class. Then for the set Z of all 
integers, we can take the set of all equivalence classes: 

    Z = (ω × ω)/~.


This is in fact what we do. Call a pair of natural numbers a difference; 
then an integer will be an equivalence class of differences. Consider two 
differences <m, n> and <p, q>. When should we call them equivalent? 
Informally, they are equivalent iff m − n = p − q, but this equation has no 
official meaning for us yet. But the equation is equivalent to m + q = p + n, 
and the latter equation is meaningful. Consequently we formulate the 
following definition. 


Fig. 19. There is a subset of Z that looks like ω. 


Definition Define ~ to be the relation on ω × ω for which 

    <m, n> ~ <p, q>   iff   m + q = p + n.

Thus ~ is a set of ordered pairs whose domain and range are also sets 
of ordered pairs. In more explicit (but less readable) form, the above 
definition can be stated: 

    ~ = {<<m, n>, <p, q>> | m + q = p + n and all are in ω}.

Theorem 5ZA The relation ~ is an equivalence relation on ω × ω. 


Proof We leave it to you to check that ~ is reflexive on ω × ω and is 
symmetric. To show transitivity, suppose that <m, n> ~ <p, q> and 
<p, q> ~ <r, s>. Then (by the definition of ~) we have m + q = p + n and 
p + s = r + q, and adding these two equations gives 

    m + q + p + s = p + n + r + q.

By use of the cancellation law (Corollary 4P), we obtain m + s = r + n, and 
thus <m, n> ~ <r, s>. ⊣
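The definition of ~ uses only addition of natural numbers, so it is easy to test on a finite grid of differences. The sketch below assumes Python tuples (m, n) of nonnegative integers as stand-ins for the pairs <m, n>; the name equivalent is hypothetical, and the checks cover only pairs with coordinates below 5.

```python
def equivalent(d1, d2):
    """<m, n> ~ <p, q> iff m + q = p + n."""
    (m, n), (p, q) = d1, d2
    return m + q == p + n

pairs = [(m, n) for m in range(5) for n in range(5)]

reflexive = all(equivalent(d, d) for d in pairs)
symmetric = all(
    equivalent(d2, d1) for d1 in pairs for d2 in pairs if equivalent(d1, d2)
)
transitive = all(
    equivalent(d1, d3)
    for d1 in pairs for d2 in pairs for d3 in pairs
    if equivalent(d1, d2) and equivalent(d2, d3)
)
```

In this model equivalent((2, 3), (0, 1)) holds, so the two names 2 − 3 and 0 − 1 for −1 are identified, while (2, 3) and (3, 2) are not equivalent.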


Definition The set Z of integers is the set (ω × ω)/~ of all equivalence 
classes of differences. 


For example, the integer 2_Z is the equivalence class 

    [<2, 0>] = {<2, 0>, <3, 1>, <4, 2>, ...}

and the integer −3_Z is the equivalence class 

    [<0, 3>] = {<0, 3>, <1, 4>, <2, 5>, ...}.

These equivalence classes can be pictured as 45° lines in the Cartesian 
product ω × ω (Fig. 20). 

Next we want to endow Z with a suitable addition operation. Informally, 
we can add differences: 


(m − n) + (p − q) = (m + p) − (n + q).


This indicates that the correct addition function +_Z for integers will satisfy
the equation


[<m, n>] +_Z [<p, q>] = [<m + p, n + q>].


This equation will serve to define +_Z, once we have verified that it makes
sense. The situation here is of the sort discussed in Theorem 3Q. We want
to specify the value of the operation +_Z at a pair of equivalence classes
by (1) selecting representatives <m, n> and <p, q> from the classes, (2)
operating on the representatives (by vector addition in this case), and then
(3) forming the equivalence class of the result of the vector addition. For
+_Z to be well defined, we must verify that choice of other representatives
<m', n'> and <p', q'> from the given classes would yield the same equivalence
class for the sum (Fig. 21).


Fig. 20. An integer is a line in ω × ω.

Fig. 21. Addition of lines is well defined.


Lemma 5ZB If <m, n> ~ <m', n'> and <p, q> ~ <p', q'>, then
<m + p, n + q> ~ <m' + p', n' + q'>.

Proof We are given, by hypothesis, the two equations
m + n' = m' + n and p + q' = p' + q.
We want to obtain the equation
m + p + n' + q' = m' + p' + n + q.
But this results from just adding together the two given equations. ⊣


This lemma justifies the definition of +_Z. In the terminology of
Theorem 3Q, it says that the function F of vector addition


F(<m, n>, <p, q>) = <m + p, n + q>

is compatible with ~. Hence there is a well-defined function F̂ on the
quotient set; F̂ is just our operation +_Z. It satisfies the equation


[<m, n>] +_Z [<p, q>] = [<m + p, n + q>].
In other words, for integers a and b our addition formula is
a +_Z b = [<m + p, n + q>],


where <m, n> is chosen from a and <p, q> is chosen from b. Theorem 3Q 
assures us that the equivalence class on the right is independent of how 
these choices are made. 


Example We can calculate 2_Z +_Z (−3_Z). Since 2_Z = [<2, 0>] and
−3_Z = [<0, 3>], we have
2_Z +_Z (−3_Z) = [<2, 0>] +_Z [<0, 3>]
= [<2 + 0, 0 + 3>]
= [<2, 3>]
= −1_Z.
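The calculation above, together with the well-definedness guaranteed by Lemma 5ZB, can be replayed mechanically. This is a sketch under our own conventions: the helper canon, which picks the representative having a zero component, is a naming convenience introduced here and is not part of the construction.

```python
def equiv(d1, d2):
    """<m, n> ~ <p, q> iff m + q = p + n."""
    (m, n), (p, q) = d1, d2
    return m + q == p + n

def add(d1, d2):
    """Vector addition F on differences."""
    (m, n), (p, q) = d1, d2
    return (m + p, n + q)

def canon(d):
    """Hypothetical helper: the representative of [d] with a zero component."""
    m, n = d
    k = min(m, n)
    return (m - k, n - k)

two, minus_three = (2, 0), (0, 3)
total = add(two, minus_three)
print(total, canon(total))  # <2, 3>, which names the same integer as <0, 1>

# Other representatives of 2 and of -3 yield an equivalent sum (Lemma 5ZB).
reps_two = [(2 + k, k) for k in range(5)]
reps_minus_three = [(k, 3 + k) for k in range(5)]
well_defined = all(
    equiv(add(a, b), total) for a in reps_two for b in reps_minus_three
)
print(well_defined)
```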


The familiar properties of addition, such as commutativity and 
associativity, now follow easily from the corresponding properties of 
addition of natural numbers. 


Theorem 5ZC The operation +_Z is commutative and associative:
a +_Z b = b +_Z a,
(a +_Z b) +_Z c = a +_Z (b +_Z c).


Proof The integer a must be of the form [<m, n>] for some natural
numbers m and n; similarly b is [<p, q>]. Then:


a +_Z b = [<m, n>] +_Z [<p, q>]
= [<m + p, n + q>] by definition of +_Z
= [<p + m, q + n>] by commutativity of + on ω
= [<p, q>] +_Z [<m, n>]
= b +_Z a.
The calculation for associativity is similar (Exercise 4). ⊣

Let 0_Z = [<0, 0>]. Then it is straightforward to verify that a +_Z 0_Z = a
for any integer a, i.e., 0_Z is an identity element for addition. And the new
feature that Z has (and the feature for which the extension from ω to Z
was made) is the existence of additive inverses.
Theorem 5ZD (a) 0_Z is an identity element for +_Z:
a +_Z 0_Z = a
for any a in Z.


(b) Additive inverses exist: For any integer a, there is an integer b
such that


a +_Z b = 0_Z.


Proof (b) Given an integer a, it must be of the form [<m, n>].
Take b to be [<n, m>]. Then a +_Z b = [<m + n, n + m>] = [<0, 0>] = 0_Z.
⊣


Theorems 5ZC and 5ZD together say that Z with the operation +_Z
and the identity element 0_Z is an Abelian group. The concept of an
Abelian group is central to abstract algebra, but in this book the concept 
will receive only passing attention. 

Inverses are unique. That is, if a +_Z b = 0_Z and a +_Z b' = 0_Z, then b = b'.
To prove this, observe that

b = b +_Z (a +_Z b') = (b +_Z a) +_Z b' = b'.


(This proof works in any Abelian group.) The inverse of a is denoted as
−a. Then as the proof of Theorem 5ZD shows,


−[<m, n>] = [<n, m>].


Inverses provide us with a subtraction operation, which we define by the
equation


b − a = b +_Z (−a).

We can also endow the set Z with a multiplication operation, which 
we obtain in much the same way as we obtained the addition operation. 
First we look at the informal calculation with differences 

(m − n) · (p − q) = (mp + nq) − (mq + np),
which tells us that the desired operation ·_Z will satisfy the equation
[<m, n>] ·_Z [<p, q>] = [<mp + nq, mq + np>].


(Here we write, as usual, mp in place of m · p.) Again we must verify that
the above equation characterizes a well-defined operation on equivalence
classes. That is, we must verify that the operation on differences


G(<m, n>, <p, q>) = <mp + nq, mq + np>


is compatible with ~. This verification is accomplished by the following 
lemma. 
Lemma 5ZE If <m, n> ~ <m', n'> and <p, q> ~ <p', q'>, then
<mp + nq, mq + np> ~ <m'p' + n'q', m'q' + n'p'>.

Proof We are given the two equations
(1) m + n' = m' + n,
(2) p + q' = p' + q,
and we want to obtain the equation
mp + nq + m'q' + n'p' = m'p' + n'q' + mq + np.


The idea is to take multiples of (1) and (2) that contain the terms we need.
First multiply Eq. (1) by p; this gives us an mp on the left and an np on
the right. Second, multiply the reverse of Eq. (1) by q; this gives us an nq
on the left and an mq on the right. Third, multiply Eq. (2) by m'. Fourth,
multiply the reverse of Eq. (2) by n'. Now add the four equations we have
obtained from (1) and (2). All the unwanted terms cancel, and we are left
with the desired equation. It works. ⊣
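The compatibility of the operation G with ~ can also be spot-checked by brute force. The sketch below assumes a small hypothetical bound N and uses our own function names; it examines every pair of equivalent representatives inside the grid and confirms that the products stay equivalent, which illustrates Lemma 5ZE without proving it.

```python
from itertools import product

def equiv(d1, d2):
    """<m, n> ~ <p, q> iff m + q = p + n."""
    (m, n), (p, q) = d1, d2
    return m + q == p + n

def mul(d1, d2):
    """The operation G on differences: <mp + nq, mq + np>."""
    (m, n), (p, q) = d1, d2
    return (m * p + n * q, m * q + n * p)

N = 5  # hypothetical bound on the components that are tested
grid = list(product(range(N), repeat=2))

# Replacing either factor by an equivalent difference gives an equivalent product.
compatible = all(
    equiv(mul(a, b), mul(a2, b2))
    for a in grid for a2 in grid if equiv(a, a2)
    for b in grid for b2 in grid if equiv(b, b2)
)
print(compatible)
print(mul((2, 0), (0, 3)))  # a representative of 2 times -3
```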


As for addition, we can prove the basic properties of multiplication 
from the corresponding properties of multiplication of natural numbers. 


Theorem 5ZF The multiplication operation ·_Z is commutative, associative,
and distributive over +_Z:


a ·_Z b = b ·_Z a,
(a ·_Z b) ·_Z c = a ·_Z (b ·_Z c),
a ·_Z (b +_Z c) = (a ·_Z b) +_Z (a ·_Z c).


Proof Say that a = [<m, n>] and b = [<p, q>]. For the commutative
law, we have


a ·_Z b = [<mp + nq, mq + np>],
whereas


b ·_Z a = [<pm + qn, pn + qm>].


The equality of these two follows at once from the commutativity of
addition and multiplication in ω.

The other parts of the theorem are proved by the same method. Say
that c = [<r, s>]. Then (a ·_Z b) ·_Z c is


[<(mp + nq)r + (mq + np)s, (mp + nq)s + (mq + np)r>],
whereas a ·_Z (b ·_Z c) is


[<m(pr + qs) + n(ps + qr), m(ps + qr) + n(pr + qs)>].

The equality of these follows from laws of arithmetic in ω (Theorem 4K).
As for the distributive law, when we expand a ·_Z (b +_Z c), we obtain


[<m(p + r) + n(q + s), m(q + s) + n(p + r)>],
whereas when we expand (a ·_Z b) +_Z (a ·_Z c) we obtain
[<mp + nq + mr + ns, mq + np + ms + nr>].
Again equality is clear from laws of arithmetic in ω. ⊣


The remaining properties of multiplication that we will need constitute 
the next theorem. Let 1_Z be the integer [<1, 0>].


Theorem 5ZG (a) The integer 1_Z is a multiplicative identity element:
a ·_Z 1_Z = a
for any integer a.


(b) 0_Z ≠ 1_Z.
(c) Whenever a ·_Z b = 0_Z, then either a = 0_Z or b = 0_Z.


Part (c) is sometimes stated: There are no “zero divisors” in Z.


Proof Part (a) is a trivial calculation.

For part (b) it is necessary to check that <0, 0> ≁ <1, 0>. This reduces
to checking that 0 ≠ 1 in ω, which is true.

For part (c), assume that a ≠ 0_Z and b ≠ 0_Z; it will suffice to prove that
a ·_Z b ≠ 0_Z. We know that for some m, n, p, and q:


a = [<m, n>], b = [<p, q>],
a ·_Z b = [<mp + nq, mq + np>].

Since a ≠ [<0, 0>], we have m ≠ n. So either m ∈ n or n ∈ m. Similarly, either
p ∈ q or q ∈ p. This leads to a total of four cases, but in each case we have
mp + nq ≠ mq + np
by Exercise 25 of Chapter 4. Hence a ·_Z b ≠ [<0, 0>]. ⊣


In algebraic terminology, we can say that Z together with +_Z, ·_Z, 0_Z,
and 1_Z forms an integral domain. This means that:


(i) Z with +_Z and 0_Z forms an Abelian group (Theorems 5ZC and
5ZD).
(ii) Multiplication is commutative and associative, and is distributive
over addition (Theorem 5ZF).
(iii) 1_Z is a multiplicative identity (different from 0_Z), and no zero
divisors exist (Theorem 5ZG).

There is a summary of these algebraic concepts near the end of this
chapter. The value of the concepts stems from the large array of structures 
that satisfy the various conditions. In this book, however, we are concerned 
with only the most standard cases. 


Example The calculation
[<0, 1>] ·_Z [<m, n>] = [<n, m>]
shows that −1_Z ·_Z a = −a.


Next we develop an ordering relation <_Z on the integers. The informal
calculation


m − n < p − q iff m + q < p + n
indicates that the ordering <_Z on Z should be defined by
[<m, n>] <_Z [<p, q>] iff m + q ∈ p + n.


As usual, it is necessary to check that this condition yields a well-defined
relation on the integers. That is, we want to define


a <_Z b iff m + q ∈ p + n,


where m, n, p, and q are chosen so that a = [<m, n>] and b = [<p, q>].
But that choice can be made in infinitely many ways; we must verify that
we have the same outcome each time. The following lemma does just this.


Lemma 5ZH If <m, n> ~ <m', n'> and <p, q> ~ <p', q'>, then
m + q ∈ p + n iff m' + q' ∈ p' + n'.
Proof The hypotheses give us the equations
m + n' = m' + n and p + q' = p' + q.


In order to utilize these equations in the inequality m + q ∈ p + n, we add
n' and q' to each side of this inequality:


m + q ∈ p + n ⟺ m + q + n' + q' ∈ p + n + n' + q'
⟺ m' + n + q + q' ∈ p' + q + n + n'
⟺ m' + q' ∈ p' + n'.
Here the first and third steps use Theorem 4N, while the middle step uses
the given equations. ⊣


Theorem 5ZI The relation <_Z is a linear ordering relation on the
set of integers.

Proof We must show that <_Z is a transitive relation that satisfies
trichotomy on Z.

To prove transitivity, consider integers a = [<m, n>], b = [<p, q>], and
c = [<r, s>]. Then


a <_Z b & b <_Z c ⟹ m + q ∈ p + n & p + s ∈ r + q
⟹ m + q + s ∈ p + n + s & p + s + n ∈ r + q + n
⟹ m + q + s ∈ r + q + n
⟹ m + s ∈ r + n by Theorem 4N
⟹ a <_Z c.
Proving trichotomy is easy. To say that exactly one of
a <_Z b, a = b, b <_Z a
holds is to say that exactly one of
m + q ∈ p + n, m + q = p + n, p + n ∈ m + q
holds. Thus the result follows from trichotomy in ω. ⊣
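Both Lemma 5ZH and the trichotomy argument reduce to statements about sums of natural numbers, so they can be illustrated on a finite grid of differences. The following sketch uses a hypothetical bound N and our own function names, with Python's < on integers standing in for ∈ on ω.

```python
from itertools import product

def equiv(d1, d2):
    """<m, n> ~ <p, q> iff m + q = p + n."""
    (m, n), (p, q) = d1, d2
    return m + q == p + n

def less(d1, d2):
    """[<m, n>] <_Z [<p, q>] iff m + q < p + n in the natural numbers."""
    (m, n), (p, q) = d1, d2
    return m + q < p + n

N = 5  # hypothetical bound on the components that are tested
grid = list(product(range(N), repeat=2))

# Lemma 5ZH: the outcome does not depend on the representatives chosen.
well_defined = all(
    less(a, b) == less(a2, b2)
    for a in grid for a2 in grid if equiv(a, a2)
    for b in grid for b2 in grid if equiv(b, b2)
)

# Trichotomy: exactly one of a < b, a = b, b < a holds for the named integers.
trichotomy = all(
    [less(a, b), equiv(a, b), less(b, a)].count(True) == 1
    for a in grid for b in grid
)
print(well_defined, trichotomy)
```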
An integer b is called positive iff 0_Z <_Z b. It is easy to check that
b <_Z 0_Z iff 0_Z <_Z −b.


Thus a consequence of trichotomy is the fact that for an integer b, exactly
one of the three alternatives


b is positive, b is zero, −b is positive
holds.
The next theorem shows that addition preserves order, as does multiplication
by a positive integer. (The corresponding theorem for ω was Theorem
4N.)


Theorem 5ZJ The following are valid for any integers a, b, and c:
(a) a <_Z b ⟺ a +_Z c <_Z b +_Z c.
(b) If 0_Z <_Z c, then
a <_Z b ⟺ a ·_Z c <_Z b ·_Z c.
Proof Assume that a, b, and c are [<m, n>], [<p, q>], and [<r, s>],
respectively. The result to be proved in part (a) then translates to the
following statement about natural numbers:


m + q ∈ p + n ⟺ m + r + q + s ∈ p + r + n + s.


This is an immediate consequence of the fact that addition in ω preserves
order (Theorem 4N).

Part (b) is similar in spirit. As in Theorem 4N, it suffices to prove one
direction:
0_Z <_Z c & a <_Z b ⟹ a ·_Z c <_Z b ·_Z c.
This translates to:
s ∈ r & m + q ∈ p + n ⟹ mr + ns + ps + qr ∈ pr + qs + ms + nr.


This is not as bad as it looks. If we let k = m + q and l = p + n, then it
becomes


s ∈ r & k ∈ l ⟹ kr + ls ∈ ks + lr.
This is just Exercise 25 of Chapter 4. ⊣

Corollary 5ZK For any integers a, b, and c the cancellation laws hold:
a +_Z c = b +_Z c ⟹ a = b,
a ·_Z c = b ·_Z c & c ≠ 0_Z ⟹ a = b.


Proof This follows from the preceding theorem in the same way that
the cancellation laws in ω (Corollary 4P) followed from the order-
preserving properties (Theorem 4N). ⊣


Although ω is not actually a subset of Z, nonetheless Z has a subset
that is “just like” ω. To make this precise, define the function E: ω → Z by


E(n) = [<n, 0>].


For example, E(0) = 0_Z and E(1) = 1_Z.

The following theorem, in algebraic terminology, says that E is an
“isomorphic embedding” of the system <ω, +, ·, ∈_ω> into the system
<Z, +_Z, ·_Z, <_Z>. That is, E is a one-to-one function that preserves
addition, multiplication, and order.


Theorem 5ZL E maps ω one-to-one into Z, and satisfies the following
properties for any natural numbers m and n:


(a) E(m + n) = E(m) +_Z E(n).
(b) E(mn) = E(m) ·_Z E(n).
(c) m ∈ n iff E(m) <_Z E(n).


Proof To show that E is one-to-one we calculate
E(m) = E(n) ⟹ [<m, 0>] = [<n, 0>]
⟹ <m, 0> ~ <n, 0>
⟹ m = n.


Parts (a), (b), and (c) are proved by routine calculations (Exercise 8). ⊣
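The routine calculations behind Theorem 5ZL can be previewed on small natural numbers. In this sketch an integer is represented by any one of its differences, equality of integers is tested with ~, and all function names and the test range are our own assumptions. It does not replace Exercise 8.

```python
def equiv(d1, d2):
    """<m, n> ~ <p, q> iff m + q = p + n."""
    (m, n), (p, q) = d1, d2
    return m + q == p + n

def add(d1, d2):
    (m, n), (p, q) = d1, d2
    return (m + p, n + q)

def mul(d1, d2):
    (m, n), (p, q) = d1, d2
    return (m * p + n * q, m * q + n * p)

def less(d1, d2):
    (m, n), (p, q) = d1, d2
    return m + q < p + n

def E(n):
    """A representative of E(n) = [<n, 0>]."""
    return (n, 0)

small = range(8)  # hypothetical test range
preserves = all(
    equiv(E(m + n), add(E(m), E(n)))         # part (a)
    and equiv(E(m * n), mul(E(m), E(n)))     # part (b)
    and ((m < n) == less(E(m), E(n)))        # part (c)
    for m in small for n in small
)
one_to_one = all(m == n for m in small for n in small if equiv(E(m), E(n)))
print(preserves, one_to_one)
```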
Finally we can give a precise counterpart to our motivating guideline
that the difference <m, n> should name m − n. For any m and n,


[<m, n>] = E(m) − E(n)


as is verified by evaluating the right side of this equation (Exercise 9).

Henceforth we will streamline our notation by omitting the subscript
“Z” on +_Z, ·_Z, <_Z, 0_Z, 1_Z, etc. Furthermore a · b will usually be written
as just ab.


Exercises 
1. Is there a function F: Z → Z satisfying the equation
F([<m, n>]) = [<m + n, n>]?
2. Is there a function G: Z → Z satisfying the equation
G([<m, n>]) = [<m, m>]?
3. Is there a function H: Z → Z satisfying the equation
H([<m, n>]) = [<n, m>]?


4. Prove that +_Z is associative. (This is part of Theorem 5ZC.)
5. Give a formula for subtraction of integers:


[<m, n>] − [<p, q>] = ?
6. Show that a ·_Z 0_Z = 0_Z for every integer a.
7. Show that


a ·_Z (−b) = (−a) ·_Z b = −(a ·_Z b)
for all integers a and b. 
8. Prove parts (a), (b), and (c) of Theorem 5ZL. 
9. Show that 
[<m, n>] = E(m) — E(n) 


for all natural numbers m and n. 


RATIONAL NUMBERS 


We can extend our set Z of integers to the set Q of rational numbers in 
much the same way as we extended ω to Z. In fact, the extension from Z to Q
is to multiplication what the extension from ω to Z is to addition. In the
integers we get additive inverses, i.e., solutions to the equation a + x = 0.
In the rationals we will get multiplicative inverses, i.e., solutions to the
equation r ·_Q x = 1_Q (for nonzero r).

We can name a rational number by using two integers and a symbol
for division:

1/2, −3/4, 6/12.
But as before, each number has a multiplicity of names, e.g., 1/2 = 6/12. So 
the name “1/2” must be identified with the name “6/12.” 

By a fraction we mean an ordered pair of integers, the second component
of which (called the denominator) is nonzero. For example, <1, 2>
and <6, 12> are fractions; we want a suitable equivalence relation ~ for
which <1, 2> ~ <6, 12>. Since a/b = c/d iff a · d = c · b, we choose to
define ~ as follows. Let Z' be the set Z − {0} of nonzero integers. Then
Z × Z' is the set of all fractions.


Definition Define ~ to be the binary relation on Z × Z' for which
<a, b> ~ <c, d> iff a · d = c · b.


The set Q of rational numbers is the set (Z × Z')/~ of all equivalence classes
of fractions. 


We use the same symbol “~” that has been used previously for other 
equivalence relations, but as we discuss only one equivalence relation at 
a time, no confusion should result. 

For example, <1, 2> ~ <6, 12> since 1 · 12 = 6 · 2. The equivalence class
[<1, 2>] is the rational number “one-half.” The rationals zero and one are


0_Q = [<0, 1>] and 1_Q = [<1, 1>].


These are distinct, because <0, 1> ≁ <1, 1>. Of course we must check that ~
is indeed an equivalence relation. 


Theorem 5QA The relation ~ is an equivalence relation on Z × Z'.


Proof You should verify that the relation is reflexive on Z × Z' and is
symmetric. As for transitivity, suppose that


<a, b> ~ <c, d> and <c, d> ~ <e, f>.


Then
ad = cb and cf = ed.


Multiply the first equation by f and the second by b to get
adf = cbf and cfb = edb.


From this we conclude that adf = edb and hence (by canceling the nonzero
d) af = eb. This tells us that <a, b> ~ <e, f>. ⊣
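The proof of transitivity cancels the nonzero d, and a small experiment shows why that hypothesis matters. The sketch below uses our own names and a hypothetical range of test integers: it checks transitivity of the cross-multiplication test on genuine fractions, and then shows how the same test fails to be transitive once the pair <0, 0> is admitted.

```python
from itertools import product

def equiv(f1, f2):
    """<a, b> ~ <c, d> iff a*d = c*b (cross-multiplication)."""
    (a, b), (c, d) = f1, f2
    return a * d == c * b

R = range(-3, 4)  # hypothetical range of test integers
fractions = [(a, b) for a, b in product(R, R) if b != 0]

transitive = all(
    equiv(x, z)
    for x in fractions for y in fractions if equiv(x, y)
    for z in fractions if equiv(y, z)
)
print(equiv((1, 2), (6, 12)), transitive)

# With a zero denominator allowed, <0, 0> would be "equivalent" to everything,
# and transitivity would collapse:
print(equiv((1, 2), (0, 0)), equiv((0, 0), (3, 1)), equiv((1, 2), (3, 1)))
```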
We can picture the equivalence classes as nonhorizontal lines (in the
“plane” Z × Z') through the origin (Fig. 22). The fraction <1, 2> lies on the
line with slope 2; in general, [<a, b>] is the line with slope b/a.

We arrive at addition and multiplication operations for Q by the same
methods used for Z. For addition, the informal calculation


a/b + c/d = (ad + cb)/bd


indicates that +_Q should be defined by the equation


[<a, b>] +_Q [<c, d>] = [<ad + cb, bd>].


Note that bd ≠ 0 since b ≠ 0 and d ≠ 0. Hence <ad + cb, bd> is a fraction.
As usual, we must check that there is a well-defined function +_Q on
equivalence classes that satisfies the above equation. The following lemma,
together with Theorem 3Q, does just that.


Fig. 22. Rational numbers are nonhorizontal lines.
Lemma 5QB If <a, b> ~ <a', b'> and <c, d> ~ <c', d'>, then
<ad + cb, bd> ~ <a'd' + c'b', b'd'>.

Proof We are given the equations
ab' = a'b and cd' = c'd.
We want the equation
(ad + cb)b'd' = (a'd' + c'b')bd,
which, when expanded (with the factors in alphabetic order), becomes


ab'dd' + bb'cd' = a'bdd' + bb'c'd.


This clearly is obtainable from the given equations. ⊣


Example Just to be on the safe side, we will check that 2 + 2 = 4 in Q.
Let 2_Q = [<2, 1>] and 4_Q = [<4, 1>]. Then


2_Q +_Q 2_Q = [<2, 1>] +_Q [<2, 1>]
= [<2 + 2, 1>]
= [<4, 1>] = 4_Q,
where we use the fact that 2 + 2 = 4 in Z.
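The same check can be run with the addition formula for fractions, and with other fractions chosen from the class of 2. This is a sketch with our own names; the particular list of representatives is an arbitrary assumption, and Python integers stand in for the integers constructed earlier.

```python
def equiv(f1, f2):
    """<a, b> ~ <c, d> iff a*d = c*b."""
    (a, b), (c, d) = f1, f2
    return a * d == c * b

def add(f1, f2):
    """Addition on fractions: <ad + cb, bd>."""
    (a, b), (c, d) = f1, f2
    return (a * d + c * b, b * d)

two, four = (2, 1), (4, 1)
print(add(two, two))  # (2*1 + 2*1, 1*1)

# Other fractions naming 2, including one with a negative denominator.
reps_two = [(2, 1), (4, 2), (-6, -3), (10, 5)]
well_defined = all(equiv(add(x, y), four) for x in reps_two for y in reps_two)
print(well_defined)
```

The second check is an instance of Lemma 5QB: whichever representatives are chosen, the sum lands in the class of <4, 1>.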
The rationals with +_Q and 0_Q also form an Abelian group:

Theorem 5QC (a) Addition +_Q is associative and commutative:
(q +_Q r) +_Q s = q +_Q (r +_Q s),
r +_Q s = s +_Q r.
(b) 0_Q is an identity element for +_Q:
r +_Q 0_Q = r
for any r in Q.
(c) Additive inverses exist: For any r in Q there is an s in Q such
that r +_Q s = 0_Q.

Proof First we verify commutativity. On the one hand,
[<a, b>] +_Q [<c, d>] = [<ad + cb, bd>],


and on the other
[<c, d>] +_Q [<a, b>] = [<cb + ad, db>].
But the right sides of these two equations are equal, by known commutative
laws for arithmetic in Z.

The verification of associativity is similar. Consider three rational
numbers [<a, b>], [<c, d>], and [<e, f>]. Then one grouping for the sum is


([<a, b>] +_Q [<c, d>]) +_Q [<e, f>] = [<ad + cb, bd>] +_Q [<e, f>]
= [<(ad + cb)f + e(bd), (bd)f>]
= [<adf + cbf + ebd, bdf>].


The same expansion for the other grouping is
[<a, b>] +_Q ([<c, d>] +_Q [<e, f>]) = [<a, b>] +_Q [<cf + ed, df>]
= [<a(df) + (cf + ed)b, b(df)>]
= [<adf + cfb + edb, bdf>],
which agrees with the first calculation.
Part (b) is a routine calculation. We know that r = [<a, b>] for some
integers a and b. Then
r +_Q 0_Q = [<a, b>] +_Q [<0, 1>]
= [<a · 1 + 0 · b, b · 1>]
= [<a, b>] = r.
Finally for part (c) we select (with r as above) s = [<−a, b>]. Then it is
easy to calculate that
r +_Q s = [<a, b>] +_Q [<−a, b>]
= [<ab + (−a)b, bb>]
= [<0, bb>] = 0_Q,
since <0, bb> ~ <0, 1>. ⊣
As in any Abelian group, the inverse of r is unique; we denote it as −r.
The above proof shows that −[<a, b>] = [<−a, b>].


For rational numbers, multiplication is simpler than addition. The 
informal calculation 


(a/b) · (c/d) = ac/bd
indicates that ·_Q should be defined by the equation
[<a, b>] ·_Q [<c, d>] = [<ac, bd>].


(Notice the close analogy with +_Z.) This multiplication function is well
defined, as the following lemma verifies.

Lemma 5QD If <a, b> ~ <a', b'> and <c, d> ~ <c', d'>, then
<ac, bd> ~ <a'c', b'd'>.
Proof The proof is exactly as in Lemma 5ZB, but with addition replaced
by multiplication. ⊣


Example Recall that 1_Q = [<1, 1>]. We can now check that 1_Q is a
multiplicative identity element, i.e., that r ·_Q 1_Q = r. We know that r =
[<a, b>] for some a and b. Thus

r ·_Q 1_Q = [<a, b>] ·_Q [<1, 1>]
= [<a · 1, b · 1>]
= [<a, b>]
= r.


You should also verify that r ·_Q 0_Q = 0_Q.


Theorem 5QE Multiplication of rationals is associative, commutative,
and distributive over addition:
(p ·_Q q) ·_Q r = p ·_Q (q ·_Q r),
q ·_Q r = r ·_Q q,
p ·_Q (q +_Q r) = (p ·_Q q) +_Q (p ·_Q r).

Proof The verification of associativity and commutativity is directly
analogous to verification of the same properties for +_Z.

We will proceed to prove the distributive law. We know that we can
write p = [<a, b>], q = [<c, d>], and r = [<e, f>] for some integers a, b, c, d, e,
and f. Then

p ·_Q (q +_Q r) = [<a, b>] ·_Q ([<c, d>] +_Q [<e, f>])
= [<a, b>] ·_Q [<cf + ed, df>]
= [<acf + aed, bdf>].
On the other side of the expected equation we have
(p ·_Q q) +_Q (p ·_Q r) = ([<a, b>] ·_Q [<c, d>]) +_Q ([<a, b>] ·_Q [<e, f>])
= [<ac, bd>] +_Q [<ae, bf>]
= [<acbf + aebd, bdbf>].


This agrees with the first calculation because <i, j> ~ <bi, bj>. ⊣


The new property the rationals have (and that integers lack) is the 
existence of multiplicative inverses. 
Theorem 5QF For every nonzero r in Q there is a nonzero q in Q such
that r ·_Q q = 1_Q.


Proof The given r must be of the form [<a, b>], where a ≠ 0, lest
r = 0_Q. Let q = [<b, a>]. Then q ≠ 0_Q and r ·_Q q = [<ab, ab>] = 1_Q. ⊣


We can use the existence of multiplicative inverses to show that there are
no zero divisors in Q:


Corollary 5QG If r and s are nonzero rational numbers, then r ·_Q s is
also nonzero.


Proof The preceding theorem provides us with rationals r' and s' for
which r ·_Q r' = s ·_Q s' = 1_Q. Hence
(r ·_Q s) ·_Q (r' ·_Q s') = 1_Q
by using commutative and associative laws. But this implies that r ·_Q s ≠ 0_Q,
because 0_Q ·_Q (r' ·_Q s') = 0_Q ≠ 1_Q. ⊣


We can restate this corollary by saying that the set of nonzero rational 
numbers is closed under multiplication; i.e., the product of numbers in this 
set is again in this set. 

As a result of the foregoing theorems, we can assert that the nonzero 
rationals with multiplication form an Abelian group. That is, multiplication 
gives us a binary operation on the nonzero rationals that is associative and 
commutative, we have an identity element 1_Q, and we have multiplicative
inverses. As in any Abelian group, the inverse of r is unique; we denote it
as r⁻¹. The proof of Theorem 5QF shows that


[<a, b>]⁻¹ = [<b, a>].


Inverses provide us with a division operation. For a nonzero rational
r we can define

s ÷ r = s ·_Q r⁻¹.


Then we have


[<c, d>] ÷ [<a, b>] = [<c, d>] ·_Q [<b, a>]
= [<cb, da>],


a version of the “invert and multiply” rule for division of fractions.
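The “invert and multiply” rule is easy to exercise on concrete fractions. In the sketch below the functions mul, inverse, divide, and value are our own names; Python's standard fractions.Fraction type is used only as an independent yardstick for comparison and plays no role in the construction of Q.

```python
from fractions import Fraction

def mul(f1, f2):
    """Multiplication on fractions: <ac, bd>."""
    (a, b), (c, d) = f1, f2
    return (a * c, b * d)

def inverse(f):
    """<a, b> goes to <b, a>; only legitimate when a is nonzero."""
    a, b = f
    if a == 0:
        raise ZeroDivisionError("the zero rational has no multiplicative inverse")
    return (b, a)

def divide(s, r):
    """s divided by r, computed as s times the inverse of r."""
    return mul(s, inverse(r))

def value(f):
    """Yardstick only: the built-in rational named by the fraction f."""
    return Fraction(f[0], f[1])

q = divide((3, 4), (-5, 2))
print(q, value(q))
print(value(mul((7, 3), inverse((7, 3)))))  # r times its inverse names 1
```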

The algebraic concept exemplified by the rational numbers is the concept 
of a field. To say that <Q, +_Q, ·_Q, 0_Q, 1_Q> is a field means that it is an
integral domain with the further property that multiplicative inverses exist. 
(Other examples of fields are provided by the real numbers and by the 
complex numbers.) The method we have used to extend from Z to Q can be 
applied to extend any integral domain to a field. 
Next we want to define the ordering relation for the rational numbers.
The informal calculation


a/b < c/d iff ad < cb

is correct if b and d are positive. There is no guarantee that denominators
are always positive. But because


[<a, b>] = [<−a, −b>],


every rational number can be represented by some fraction with a positive
denominator. (Recall that for nonzero b, either b or −b is positive.) The
above informal calculation then suggests that we define <_Q so that


[<a, b>] <_Q [<c, d>] iff ad < cb


whenever b and d are positive. As with <_Z, we must verify that this
condition yields a well-defined relation. The following lemma accomplishes
the verification.


Lemma 5QH Assume that <a, b> ~ <a', b'> and <c, d> ~ <c', d'>.
Further assume that b, b', d, and d' are all positive. Then
ad < cb iff a'd' < c'b'.
Proof The proof is the same as the proof of Lemma 5ZH, but with
multiplication of integers in place of addition of natural numbers. ⊣


This lemma guarantees that when we test to see whether or not r <_Q s,
it does not matter which fractions with positive denominators we choose
from r and s.


Example To check that 0_Q <_Q 1_Q, we choose fractions <0, 1> ∈ 0_Q and
<1, 1> ∈ 1_Q. Then since 0 · 1 < 1 · 1, we do indeed have 0_Q <_Q 1_Q. But we
could also have chosen fractions <0, 4> ∈ 0_Q and <3, 3> ∈ 1_Q. Then since
0 · 3 < 3 · 4, we again find, in consistency with the first calculation, that
0_Q <_Q 1_Q.
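The requirement of positive denominators is not a formality, as a short experiment shows. In this sketch (all names are our own) the function less first replaces <a, b> by <−a, −b> when b is negative, while naive_less applies the cross-multiplication test blindly and is included only to exhibit the failure.

```python
def normalize_sign(f):
    """Return an equivalent fraction with a positive denominator."""
    a, b = f
    return (a, b) if b > 0 else (-a, -b)

def less(f1, f2):
    """ad < cb, tested on representatives with positive denominators."""
    (a, b), (c, d) = normalize_sign(f1), normalize_sign(f2)
    return a * d < c * b

def naive_less(f1, f2):
    """The same test without sign normalization (wrong in general)."""
    (a, b), (c, d) = f1, f2
    return a * d < c * b

# The two computations from the example agree.
print(less((0, 1), (1, 1)), less((0, 4), (3, 3)))

# <1, -2> names minus one-half, which is below one-half.
print(less((1, -2), (1, 2)), naive_less((1, -2), (1, 2)))
```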

Theorem 5QI The relation <_Q is a linear ordering on Q.


Proof The proof is the same as the proof of Theorem 5ZI, with multiplication
in place of addition. For example, to prove trichotomy, we consider
rational numbers r and s. For suitable integers we can write


r = [<a, b>] and s = [<c, d>],
where b and d are positive. Then trichotomy for Z tells us that exactly one of


ad < cb, ad = cb, cb < ad
holds, whence exactly one of


r <_Q s, r = s, s <_Q r


holds. ⊣


One can check that r <_Q 0_Q iff 0_Q <_Q −r (Exercise 12). Call q positive
iff 0_Q <_Q q. Then as a consequence of trichotomy, we have the fact that for any
rational number r, exactly one of the three alternatives


r is positive, r is zero, −r is positive
holds. We can define the absolute value |r| of r by


|r| = −r if −r is positive,
|r| = r otherwise.


Then 0_Q ≤_Q |r| for every r.
Next we prove that order is preserved by addition and by multiplication 
by a positive factor. 


Theorem 5QJ Let r, s, and t be rational numbers.


(a) r <_Q s iff r +_Q t <_Q s +_Q t.
(b) If t is positive, then


r <_Q s iff r ·_Q t <_Q s ·_Q t.


Proof Part (b) has the same proof as part (a) of Theorem 5ZJ, but with
multiplication in place of addition. To prove part (a), we first write r, s, and
t in the form


r = [<a, b>], s = [<c, d>], t = [<e, f>],


where b, d, and f are positive. Since t is a positive rational, e is also a
positive integer. Then


r +_Q t <_Q s +_Q t ⟺ [<af + eb, bf>] <_Q [<cf + ed, df>]
⟺ (af + eb)df < (cf + ed)bf
⟺ adff + bdef < bcff + bdef
⟺ ad < cb by Theorem 5ZJ
⟺ r <_Q s
as desired. ⊣


We have already said that the rational numbers form a field; the two
preceding theorems state that <Q, +_Q, ·_Q, 0_Q, 1_Q, <_Q> is an ordered field.
Theorem 5QK The following cancellation laws hold for any rational
numbers.


(a) If r +_Q t = s +_Q t, then r = s.
(b) If r ·_Q t = s ·_Q t and t is nonzero, then r = s.


Proof We can prove this as a corollary of the preceding theorem,
following our past pattern. But there is now a simpler option open to us.
In part (a) we add −t to both sides of the given equation, and in part (b)
we multiply both sides of the given equation by t⁻¹. (This proof works
in any Abelian group.)


Finally, we want to show that, although Z is not a subset of Q, 
nevertheless Q has a subset that is “just like” Z. Define the embedding 
function E: Z → Q by


E(a) = [<a, 1>].


This function gives us an isomorphic embedding in the sense that the
following theorem holds.


Theorem 5QL E is a one-to-one function from Z into Q satisfying the
following conditions:

(a) E(a + b) = E(a) +_Q E(b).

(b) E(ab) = E(a) ·_Q E(b).

(c) E(0) = 0_Q and E(1) = 1_Q.

(d) a < b iff E(a) <_Q E(b).


Proof Each part of the theorem can be proved by direct calculation.
First we check that E is one-to-one:


E(a) = E(b) ⟹ [<a, 1>] = [<b, 1>]
⟹ <a, 1> ~ <b, 1>
⟹ a = b.


Parts (a), (b), and (d) are proved by the following calculations:


E(a) +_Q E(b) = [<a, 1>] +_Q [<b, 1>]
= [<a + b, 1>]
= E(a + b),

E(a) ·_Q E(b) = [<a, 1>] ·_Q [<b, 1>]
= [<ab, 1>]
= E(ab),

E(a) <_Q E(b) ⟺ [<a, 1>] <_Q [<b, 1>]
⟺ a · 1 < b · 1
⟺ a < b.
Finally part (c) is a restatement of the definitions of 0_Q and 1_Q. ⊣
We also obtain the following relation between fractions and division:
[<a, b>] = E(a) ÷ E(b).


Since b ≠ 0, we have E(b) ≠ 0_Q, and so the indicated division is possible.
Henceforth we will simplify the notation by omitting the subscript “Q”
on +_Q, ·_Q, 0_Q, and so forth. Also the product r · s will usually be written
as just rs.


Exercises 


10. Show that r ·_Q 0_Q = 0_Q for every rational number r.


11. Give a direct proof (not using Theorem 5QF) that if r ·_Q s = 0_Q, then
either r = 0_Q or s = 0_Q.


12. Show that


r <_Q 0_Q iff 0_Q <_Q −r.


13. Give a new proof of the cancellation law for +_Z (Corollary 5ZK(a)),
using Theorem 5ZD instead of Theorem 5ZJ.


14. Show that the ordering of rationals is dense, i.e., between any two
rationals there is a third one:


p <_Q s ⟹ (∃r)(p <_Q r <_Q s).


REAL NUMBERS 


The last number system that we will consider involves the set R of all 
real numbers. The ancient Pythagoreans discovered, to their dismay, that 
there was a need to go beyond the rational numbers. They found that there 
simply was no rational number to measure the length of the hypotenuse 
of a right triangle whose other two sides had unit length. 

In our previous extensions of number systems, we relied on the facts 
that an integer could be named by a pair of natural numbers, and a 
rational number could be named by a pair of integers. But we cannot 
hope to name a real number by a pair of rationals, because, as we will
prove in Chapter 6, there are too many real numbers and not enough pairs 
of rationals. Hence we must look at new techniques in searching for a way 
to name real numbers. 
Actually there are several methods that can be used successfully to
construct the real numbers. One approach is to utilize decimal expansions,
so that a real number is determined by an integer and an infinite sequence
of digits (a function from ω into 10). This approach may be found in
Claude Burrill’s book, Foundations of Real Numbers, McGraw-Hill, 1967. 

A more common method of constructing a suitable set R is to utilize 
the fact that a real number can be named by giving a sequence of rationals 
(a function from ω into Q) converging to it. So one can take the set of
all convergent sequences and then divide out by an equivalence relation 
(where two sequences are equivalent iff they converge to the same limit). 
But there is one hitch: The concepts of “convergent” and “equivalent” 
must be defined without reference to the real number to which the sequence 
is converging. This can be done by a technique named after Cauchy. 

Define a Cauchy sequence to be a function s: ω → Q such that |s_m − s_n|
is arbitrarily small for all sufficiently large m and n; i.e.,


(∀ positive ε in Q)(∃k ∈ ω)(∀m > k)(∀n > k) |s_m − s_n| < ε.


(Here we write s_n in place of s(n), as usual for ω-sequences.) The concept
of a Cauchy sequence is useful here because of the theorem of calculus
asserting that a sequence is convergent iff it is a Cauchy sequence.

Let C be the set of all Cauchy sequences. For r and s in C, we define
r and s to be equivalent (r ~ s) iff |r_n − s_n| is arbitrarily small for all
sufficiently large n; i.e.,


(∀ positive ε in Q)(∃k ∈ ω)(∀n > k) |r_n − s_n| < ε.


Then the quotient set C/~ is a suitable candidate for R. (This approach to 
constructing R is due to Cantor.) 
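To get a feeling for these two definitions, one can experiment with a concrete sequence of rationals. The sketch below is only an illustration under our own assumptions: the sequence (Newton's iteration for x² = 2, started at 1) is our choice of example, and the functions inspect a finite window of indices, so they can suggest but never establish the quantified statements above.

```python
from fractions import Fraction

def newton_sqrt2(n):
    """The n-th term of a rational sequence (Newton's iteration for x^2 = 2)."""
    s = Fraction(1)
    for _ in range(n):
        s = (s + 2 / s) / 2
    return s

def cauchy_on_window(seq, eps, k, window):
    """Spot check: |s_m - s_n| < eps for all k < m, n <= k + window."""
    terms = [seq(i) for i in range(k + 1, k + window + 1)]
    return all(abs(x - y) < eps for x in terms for y in terms)

def equivalent_on_window(r, s, eps, k, window):
    """Spot check: |r_n - s_n| < eps for all k < n <= k + window."""
    return all(abs(r(n) - s(n)) < eps for n in range(k + 1, k + window + 1))

def shifted(n):
    """The same sequence with its first term dropped."""
    return newton_sqrt2(n + 1)

eps = Fraction(1, 1000)
print(newton_sqrt2(2))
print(cauchy_on_window(newton_sqrt2, eps, 2, 4))
print(equivalent_on_window(newton_sqrt2, shifted, eps, 2, 4))
print(cauchy_on_window(lambda n: Fraction(n), eps, 2, 4))  # s_n = n fails
```

All arithmetic here is exact rational arithmetic, so no real number is presupposed in the checks.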

An alternative construction of R uses so-called Dedekind cuts. This is 
the method we follow henceforth in this section. The Cauchy sequence 
construction and the Dedekind cut construction each have their own 
advantages. The Dedekind cut construction of R has an initial advantage 
of simplicity, in that it provides a simple definition of R and its ordering. 
But multiplication of Dedekind cuts is awkward, and verification of the 
properties of multiplication is a tedious business. The Cauchy sequence 
construction of R also has the advantage of generality, since it can be used 
with an arbitrary metric space in place of Q. 

With these considerations in mind, we choose the following strategy. 
We will present the Dedekind cut construction, and will prove that least 
upper bounds exist in R. (This is the property that distinguishes R from 
the other ordered fields.) Although we will define addition and multiplication 
of real numbers, we will not give complete verification of the algebraic 
properties. The Cauchy sequence construction may be found, among other 
places, in Norman Hamilton and Joseph Landin’s book, Set Theory and
the Structure of Arithmetic, Allyn and Bacon, 1961. 

The idea behind Dedekind cuts is that a real number x can be named 
by giving an infinite set of rationals, namely all the rationals less than x. 
We will in effect define x to be the set of all rationals smaller than x. 
To avoid circularity in the definition, we must be able to characterize the 
sets of rationals obtainable in this way. The following definition does the 
job. 


Definition A Dedekind cut is a subset x of Q such that:

(a) ∅ ≠ x ≠ Q.
(b) x is “closed downward,” i.e.,
q ∈ x & r < q ⇒ r ∈ x.
(c) x has no largest member.


We then define the set R of real numbers to be the set of all Dedekind 
cuts. Note that there is no equivalence relation here; a real (i.e. a real 
number) is a cut, not an equivalence class of cuts. 
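To make the three conditions concrete, here is a minimal Python sketch that represents one particular cut by its membership test and spot-checks (a), (b), and (c) on a finite sample of rationals. The example cut {q ∈ Q | q < 0 or q · q < 2} and the names in_cut and larger_member are our own illustrative choices; the checks are evidence on a sample, not a proof.

```python
from fractions import Fraction

def in_cut(q):
    """Membership test for the example cut {q in Q | q < 0 or q*q < 2}."""
    return q < 0 or q * q < 2

def larger_member(q):
    """Given q in the cut, return a strictly larger rational still in the cut."""
    if q <= 0:
        return Fraction(1)            # 1 is in the cut and exceeds q
    return (2 * q + 2) / (q + 2)      # exceeds q by (2 - q*q)/(q + 2) > 0

samples = [Fraction(n, 10) for n in range(-30, 31)]
members = [q for q in samples if in_cut(q)]

# (a) nonempty and not all of Q (witnessed on the sample)
cond_a = bool(members) and len(members) < len(samples)
# (b) closed downward, checked pairwise on the sample
cond_b = all(in_cut(r) for q in members for r in samples if r < q)
# (c) no largest member: every sampled member has a larger member
cond_c = all(in_cut(larger_member(q)) and larger_member(q) > q for q in members)
```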

The ordering on R is particularly simple. For x and y in R, define

x <_R y iff x ⊂ y.

In other words, <_R is the relation of being a proper subset:

<_R = {⟨x, y⟩ ∈ R × R | x ⊂ y}.

Theorem 5RA The relation <_R is a linear ordering on R.

Proof The relation is clearly transitive; we must show that it satisfies
trichotomy on R. So consider any x and y in R. Obviously at most one
of the alternatives,

x ⊂ y, x = y, y ⊂ x,

can hold, but we must prove that at least one holds. Suppose that the
first two fail, i.e., that x ⊈ y. We must prove that y ⊂ x.

Since x ⊈ y there is some rational r in the relative complement x − y
(see Fig. 23). Consider any q ∈ y. If r < q, then since y is closed downward,
we would have r ∈ y. But r ∉ y, so we must have q < r. Since x is closed
downward, it follows that q ∈ x. Since q was arbitrary (and x ≠ y), we
have y ⊂ x. ⊣

Fig. 23. The proof of Theorem 5RA.


Now consider a set A of reals; a real number x is said to be an
upper bound of A iff y ≤_R x for every y in A. The number x itself might
or might not belong to A. The set A is bounded (i.e., bounded above) iff there
exists some upper bound of A. A least upper bound of A is an upper bound
that is less than any other upper bound.


First consider an example not in R, but in Q. The set

{r ∈ Q | r · r < 2}

of rationals whose square is less than 2 is a bounded set of rationals that
has no least upper bound in Q. (We are stating this, not proving it, but it
follows from the fact that √2 is irrational.) The following theorem shows
that examples of this sort cannot be found in R.
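One half of the unproved claim can be illustrated directly. Granting that no rational has square 2, any positive rational upper bound u of the set satisfies u · u > 2, and then (u + 2/u)/2 is a strictly smaller rational whose square still exceeds 2, hence a smaller upper bound. The sketch below (the function name is ours) runs this step a few times in exact arithmetic.

```python
from fractions import Fraction

def smaller_upper_bound(u):
    """For rational u > 0 with u*u > 2, return a smaller rational v with v*v > 2."""
    if not (u > 0 and u * u > 2):
        raise ValueError("u must be a positive rational with u*u > 2")
    return (u + 2 / u) / 2

# Start from the upper bound 3/2 and repeatedly produce smaller upper bounds.
chain = [Fraction(3, 2)]
for _ in range(4):
    chain.append(smaller_upper_bound(chain[-1]))
```

Each entry of chain is an upper bound of the set, and each is smaller than the one before it, so none of them can be a least upper bound.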


Theorem 5RB Any bounded nonempty subset of R has a least upper
bound in R.

Proof Let A be the set of real numbers in question. We will show that
the least upper bound is just ⋃A.

Simply by the definition of ⋃A, we have x ⊆ ⋃A for all x ∈ A.
Furthermore let z be any upper bound for A, so that x ⊆ z for all x ∈ A.
It then follows that ⋃A ⊆ z; compare Exercise 5 of Chapter 2. The
argument so far is not tied to R; we have only shown that ⋃A is the least
upper bound of the set A with respect to ordering by inclusion.

What remains to be shown is that ⋃A ∈ R. Because A is nonempty,
it is easy to see that ⋃A ≠ ∅. Also ⋃A ≠ Q because ⋃A ⊆ z where z is
an upper bound for A. You can easily verify (Exercise 15) that ⋃A is
closed downward and has no largest element. ⊣


The foregoing theorem is important in mathematical analysis. For 
example, it is needed in order to prove that a continuous function on a 
closed interval assumes a maximum. And this in turn is used to prove the 
mean value theorem of calculus. 

The addition operation for R is easily defined from addition of rationals.
For reals x and y, define:

x +_R y = {q + r | q ∈ x & r ∈ y}.


Lemma 5RC For real numbers x and y, the sum x +_R y is also in R.

Proof Clearly x +_R y is a nonempty subset of Q. To show that
x +_R y ≠ Q, choose some q′ in Q − x and r′ in Q − y. Then

q ∈ x & r ∈ y ⇒ q < q′ & r < r′
              ⇒ q + r < q′ + r′

so that any member q + r of x +_R y is strictly less than q′ + r′. Hence
q′ + r′ ∉ x +_R y.

To show that x +_R y is closed downward, consider any

p < q + r ∈ x +_R y

(where q ∈ x and r ∈ y). Then adding −q to both sides of the inequality,
we have p − q < r. Since y is closed downward, we have p − q ∈ y. Thus
we can write p as the sum

p = q + (p − q)

of q from x and p − q from y; this is what we need to have p ∈ x +_R y.
(Note: Here “p − q” refers to subtraction of rationals, p + (−q). Earlier
in this proof “Q − x” referred to the relative complement of x in Q. If this
sort of thing happened often, we would use a different symbol “Q \ x” for
complements. But in fact the opportunities for confusion will be rare.)

We leave it to you to verify that x +_R y has no largest member
(Exercise 16). ⊣


Theorem 5RD Addition of real numbers is associative and commutative:

(x +_R y) +_R z = x +_R (y +_R z),
x +_R y = y +_R x.

Proof Since addition of rationals is commutative, it is clear from the
definition of +_R that it is commutative as well. As for associativity, we
have

(x +_R y) +_R z = {s + r | s ∈ x +_R y & r ∈ z}
                = {(p + q) + r | p ∈ x & q ∈ y & r ∈ z},

and a similar calculation applies to the other grouping. Thus associativity
of +_R follows from associativity of addition of rationals. ⊣


The zero element of R is defined to be the set of negative rational
numbers:

0_R = {r ∈ Q | r < 0}.

Theorem 5RE (a) 0_R is a real number.

(b) For any x in R, we have x +_R 0_R = x.

Proof (a) It is easy to see that ∅ ≠ 0_R ≠ Q; for example, −1 ∈ 0_R
and 1 ∉ 0_R. And it is clear that 0_R is closed downward. The fact that 0_R
has no largest member follows immediately from the density of the rationals
(Exercise 14).

For part (b), we must prove that

{r + s | r ∈ x & s < 0} = x.

The “⊆” inclusion holds because x is closed downward. To prove the
“⊇” half, consider any p in x. Since x has no largest member, there is some
r with p < r ∈ x. Let s = p − r. Then s < 0 and p = r + s ∈ x +_R 0_R. Hence both
inclusions hold. ⊣


Fig. 24. Sometimes −x is {r ∈ Q | −r ∉ x}, but not always.

Fig. 25. In (a) x is negative; in (b) x is positive.

Before we can conclude that the real numbers form an Abelian group
with +_R and 0_R, we must prove that additive inverses exist. First we need
to say just what set −x should be (where x ∈ R). We think of the real
number −x as the set of all smaller rational numbers. If we draw a picture
as in Fig. 24, we might be tempted to think that −x is the complement of
{r ∈ Q | −r ∈ x}, or in other words, that −x should be {r ∈ Q | −r ∉ x}.
This choice is not quite right, because it may have a largest element.
Instead we define

−x = {r ∈ Q | (∃s > r) −s ∉ x}.

In Fig. 25, −x is illustrated as a subset of the rational number line.
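A worked instance may make the difficulty concrete. Take the cut x = {q ∈ Q | q < 1} (our own choice of example). The tempting definition produces a set with a largest element, whereas the definition adopted above does not:

```latex
\begin{align*}
\{r \in \mathbb{Q} \mid -r \notin x\}
  &= \{r \in \mathbb{Q} \mid -r \ge 1\}
   = \{r \in \mathbb{Q} \mid r \le -1\}
   && \text{(largest element $-1$)},\\
\{r \in \mathbb{Q} \mid (\exists s > r)\ {-s} \notin x\}
  &= \{r \in \mathbb{Q} \mid (\exists s > r)\ s \le -1\}
   = \{r \in \mathbb{Q} \mid r < -1\}
   && \text{(no largest element)}.
\end{align*}
```

The extra quantifier simply trims off the largest element when there is one, and changes nothing otherwise.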


Theorem 5RF For every x in R:

(a) −x ∈ R,
(b) x +_R (−x) = 0_R.


Proof To prove that −x is a real number, we first must show that
∅ ≠ −x ≠ Q. There is some rational t with t ∉ x; let r = −t − 1. Then
r ∈ −x because r < −t and −(−t) ∉ x. Hence −x ≠ ∅. To show that
−x ≠ Q, take any p ∈ x. We claim that −p ∉ −x. This holds because if
s > −p, then −s < p ∈ x, whence −s ∈ x. Hence −p ∉ −x and so −x ≠ Q.

It is easier to show that −x is closed downward. Suppose that
q < r ∈ −x. Then (∃s > r) −s ∉ x. Consequently (∃s > q) −s ∉ x, since the
same s can be used. Hence q ∈ −x.

It remains to show that −x has no largest element. Consider then any
element r in −x. We know that for some s > r we have −s ∉ x. Because
the rationals are densely ordered (Exercise 14), there is some p with
s > p > r. Then p ∈ −x, and p is larger than r. This completes the proof
that −x is a real number.

Now we turn to part (b). By definition

x +_R (−x) = {q + r | q ∈ x & (∃s > r) −s ∉ x}.

For any such member q + r of this set, we have r < s and q < −s
(lest −s ≤ q ∈ x). Hence by the order-preserving property of addition,

q + r < (−s) + s = 0.

This shows that x +_R (−x) ⊆ 0_R.

To establish the other inclusion, consider any p in 0_R. Then p < 0, and
so −p is positive. By Exercise 19, there is some q ∈ x for which
q + (−p ÷ 2) ∉ x. Let s = (p ÷ 2) − q, so that −s ∉ x. Then p is the sum
of q (which is in x) and p − q (which is less than s, where −s ∉ x). This
makes p a member of x +_R (−x). Thus we have both inclusions. ⊣

We have now shown that ⟨R, +_R, 0_R⟩ is an Abelian group. As in any
Abelian group, the cancellation law holds:

Corollary 5RG For any real numbers,

x +_R z = y +_R z ⇒ x = y.

Proof Simply add −z to both sides of the given equation. ⊣

Next we prove that addition preserves order.

Theorem 5RH For any real numbers,

x <_R y ⇔ x +_R z <_R y +_R z.

Proof It is easy to prove that

(1) x ≤_R y ⇒ x +_R z ≤_R y +_R z

because this amounts to the statement that if x ⊆ y, then

{q + s | q ∈ x & s ∈ z} ⊆ {r + s | r ∈ y & s ∈ z},

which is obvious. And by Corollary 5RG we have

(2) x ≠ y ⇒ x +_R z ≠ y +_R z,

which together with (1) gives the “⇒” half of the theorem. The “⇐” half
then follows by trichotomy (as in the proof to Theorem 4N). ⊣


We can define the absolute value |x| of a real number x to be the
larger of x and −x. Since our ordering is inclusion, the larger of the two
is just their union. Thus our definition becomes

|x| = x ∪ −x.

Then by Exercise 20, |x| is always nonnegative, i.e., 0_R ≤_R |x|.

Consider now the definition of multiplication. For the product of x and
y we cannot use {rs | r ∈ x & s ∈ y} (in analogy to the definition of x +_R y)
because both x and y contain negative rationals of large magnitude.
Instead we use the following variation on the above idea.

Definition (a) If x and y are nonnegative real numbers, then

x ·_R y = 0_R ∪ {rs | 0 ≤ r ∈ x & 0 ≤ s ∈ y}.

(b) If x and y are both negative real numbers, then

x ·_R y = |x| ·_R |y|.

(c) If one of the real numbers x and y is negative and one is nonnegative,
then

x ·_R y = −(|x| ·_R |y|).

The facts we want to know about multiplication are gathered into the
following theorem. Let 1_R = {r ∈ Q | r < 1}. Clearly 0_R <_R 1_R. We will not
give a proof for this theorem; a proof can be found in Appendix F of
Number Systems and the Foundations of Analysis by Elliott Mendelson
(Academic Press, 1973).


Theorem 5RI For any real numbers, the following hold:

(a) x ·_R y is a real number.

(b) Multiplication is associative, commutative, and distributive over
addition.

(c) 0_R ≠ 1_R and x ·_R 1_R = x.

(d) For nonzero x there is a nonzero real number y with x ·_R y = 1_R.

(e) Multiplication by a positive number preserves order: If 0_R <_R z, then

x <_R y ⇔ x ·_R z <_R y ·_R z.


The foregoing theorems show that, like the rationals, the reals (with
+_R, ·_R, 0_R, 1_R, and <_R) form an ordered field. But unlike the rationals,
the reals have the least-upper-bound property (Theorem 5RB). An ordered 
field is said to be complete iff it has the least-upper-bound property. It 
can be shown that the reals, in a sense, yield the only complete ordered 
field. That is, any other complete ordered field is “just like” (or more 
precisely, is isomorphic to) the ordered field of real numbers. For an 
exact statement of this theorem and for its proof, see any of the books we 
have referred to in this section, or p. 110 of Andrew Gleason’s book, 
Fundamentals of Abstract Analysis, Addison-Wesley, 1966. 

The correct embedding function E from Q into R assigns to each 
rational number r the corresponding real number 


E(r) = {q ∈ Q | q < r},

consisting of all smaller rationals.


Theorem 5RJ E is a one-to-one function from Q into R satisfying the
following conditions:

(a) E(r + s) = E(r) +_R E(s).

(b) E(rs) = E(r) ·_R E(s).

(c) E(0) = 0_R and E(1) = 1_R.

(d) r < s iff E(r) <_R E(s).


Proof First of all, we must show that E(r) is a real number. Obviously
E(r) is a set of rationals, and it is closed downward. Furthermore
∅ ≠ E(r) ≠ Q because r − 1 ∈ E(r) and r ∉ E(r). E(r) has no largest element,
because if q ∈ E(r), then by Exercise 14 there is a larger element p with
q < p < r. Hence E(r) is indeed a real number.

To show that E is one-to-one, suppose that r ≠ s. Then one is less than
the other; we may suppose that r < s. Then r ∈ E(s) whereas r ∉ E(r).
Hence E(r) ≠ E(s).

Next let us prove part (d), because it is easy. If r < s, then clearly
E(r) ⊆ E(s). The inclusion is proper since E is one-to-one. Thus

r < s ⇒ E(r) ⊂ E(s).

The converse follows from trichotomy. If E(r) ⊂ E(s), then we cannot have
r = s nor s < r (lest E(s) ⊂ E(r)), so we must have r < s.

For part (a), we have

E(r) +_R E(s) = {p + q | p ∈ E(r) & q ∈ E(s)}
              = {p + q | p < r & q < s}.

We must show that this is the same as E(r + s), i.e., that

{p + q | p < r & q < s} = {t | t < r + s}.

The “⊆” inclusion holds because by Theorem 5QJ,

p + q < r + q < r + s.

To establish the “⊇” inclusion, suppose that t < r + s. Let ε =
(r + s − t) ÷ 2; then ε > 0. Define p = r − ε and q = s − ε. Then p < r and
q < s and p + q = t. Thus we can represent t as a sum in the desired
form. Hence both inclusions hold.
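The splitting step in the “⊇” half is entirely computational, so it can be checked in exact rational arithmetic. The following sketch (the function name split_sum and the sample values are ours) carries out the construction for one choice of r, s, and t.

```python
from fractions import Fraction

def split_sum(r, s, t):
    """Given rationals with t < r + s, return (p, q) with p < r, q < s, p + q = t."""
    if not t < r + s:
        raise ValueError("need t < r + s")
    eps = (r + s - t) / 2      # the positive rational called epsilon in the proof
    return r - eps, s - eps

r, s, t = Fraction(1, 2), Fraction(1, 3), Fraction(4, 5)
p, q = split_sum(r, s, t)
```

Here r + s = 5/6 exceeds t = 4/5, so ε = 1/60, giving p = 29/60 and q = 19/60.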


Finally, we omit the (awkward) proof of part (b), and part (c) is only a
restatement of definitions. ⊣


Exercises 

15. In Theorem 5RB, show that ⋃A is closed downward and has no
largest element.

16. In Lemma 5RC, show that x +_R y has no largest element.

17. Assume that a is a positive integer. Show that for any integer b there
is some k in ω with

b < a · E(k).

18. Assume that p is a positive rational number. Show that for any
rational number r there is some k in ω with

r < p · E(E(k)).

(Here k is in ω, E(k) is the corresponding integer, and E(E(k)) is the
corresponding rational.)

19. Assume that p is a positive rational number. Show that for any real
number x there is some rational q in x such that

p + q ∉ x.

20. Show that for any real number x, we have 0_R ≤_R |x|.

21. Show that if x <_R y, then there is a rational number r with

x <_R E(r) <_R y.

22. Assume that x ∈ R. How do we know that |x| ∈ R?
SUMMARIES 


In this chapter we have given one way of constructing the real numbers 
as particular sets. Along the way, some concepts from abstract algebra 
have naturally arisen. For convenient reference, we have collected in the 
present section certain definitions that have played a key role in this chapter. 


Integers Let m, n, p, and q be natural numbers.

⟨m, n⟩ ~ ⟨p, q⟩ ⇔ m + q = p + n,
[⟨m, n⟩] +_Z [⟨p, q⟩] = [⟨m + p, n + q⟩],
−[⟨m, n⟩] = [⟨n, m⟩],
[⟨m, n⟩] ·_Z [⟨p, q⟩] = [⟨mp + nq, mq + np⟩],
[⟨m, n⟩] <_Z [⟨p, q⟩] ⇔ m + q ∈ p + n,
E(n) = [⟨n, 0⟩].

Rational numbers Let a, b, c, and d be integers with bd ≠ 0.

⟨a, b⟩ ~ ⟨c, d⟩ ⇔ ad = cb,
[⟨a, b⟩] +_Q [⟨c, d⟩] = [⟨ad + cb, bd⟩],
−[⟨a, b⟩] = [⟨−a, b⟩],
[⟨a, b⟩] ·_Q [⟨c, d⟩] = [⟨ac, bd⟩],
[⟨a, b⟩] <_Q [⟨c, d⟩] ⇔ ad < cb, when b and d are positive,
E(a) = [⟨a, 1⟩].

Real numbers A real number is a set x such that ∅ ⊂ x ⊂ Q, x is
closed downward, and x has no largest member.

x <_R y ⇔ x ⊂ y,
x +_R y = {q + r | q ∈ x & r ∈ y},
−x = {r ∈ Q | (∃s > r) −s ∉ x},
|x| = x ∪ −x,
|x| ·_R |y| = 0_R ∪ {rs | 0 ≤ r ∈ |x| & 0 ≤ s ∈ |y|},
E(r) = {q ∈ Q | q < r}.
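The integer definitions in this summary translate almost line for line into code. The sketch below works with representative pairs of natural numbers rather than with the equivalence classes themselves, so equality of integers becomes the test equiv; the function names are ours, and Python's built-in < on natural numbers stands in for ∈ in the ordering clause.

```python
def equiv(a, b):
    """<m, n> ~ <p, q> iff m + q = p + n."""
    (m, n), (p, q) = a, b
    return m + q == p + n

def add(a, b):
    (m, n), (p, q) = a, b
    return (m + p, n + q)

def neg(a):
    m, n = a
    return (n, m)

def mul(a, b):
    (m, n), (p, q) = a, b
    return (m * p + n * q, m * q + n * p)

def less(a, b):
    (m, n), (p, q) = a, b
    return m + q < p + n

def embed(n):
    """E(n): the integer corresponding to the natural number n."""
    return (n, 0)

minus_three = (2, 5)          # one representative of the integer -3
minus_two = neg(embed(2))     # the representative (0, 2) of -2
product = mul(minus_three, minus_two)
```

The pair product is (10, 4), which is equivalent to embed(6), matching (−3)(−2) = 6.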


Next we turn to the definitions from abstract algebra that are relevant 
to the number systems in this chapter.

An Abelian group (in additive notation) is a triple* ⟨A, +, 0⟩
consisting of a set A, a binary operation + on A, and an element (“zero”)
of A, such that the following conditions are met:

1. + is associative and commutative.
2. 0 is an identity element, i.e., x + 0 = x.
3. Inverses exist, i.e., ∀x ∃y x + y = 0.

An Abelian group (in multiplicative notation) is a triple ⟨A, ·, 1⟩
consisting of a set A, a binary operation · on A, and an element 1 of A,
such that the following conditions are met:

1. · is associative and commutative.
2. 1 is an identity element, i.e., x · 1 = x.
3. Inverses exist, i.e., ∀x ∃y x · y = 1.


This is, of course, the same as the preceding definition. 


A group has the same definition, except that we do not require that 
the binary operation be commutative. All of the groups that we have 
considered have, in fact, been Abelian groups. But some of our results 
(e.g., the uniqueness of inverses) are correct in any group, Abelian or not. 


A commutative ring with identity is a quintuple ⟨D, +, ·, 0, 1⟩ consisting
of a set D, binary operations + and · on D, and distinguished elements
0 and 1 of D, such that the following conditions are met:

1. ⟨D, +, 0⟩ is an Abelian group.

2. The operation · is associative and commutative, and is distributive
over addition.

3. 1 is a multiplicative identity (x · 1 = x) and 0 ≠ 1.

An integral domain is a commutative ring with identity with the
additional property that there are no zero divisors:

4. If x ≠ 0 and y ≠ 0, then also x · y ≠ 0.

A field is a commutative ring with identity in which multiplicative
inverses exist:

4′. If x is a nonzero element of D, then x · y = 1 for some y.

Any field is also an integral domain, because condition 4′ implies condition
4 (see the proof to Corollary 5QG).
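For a finite structure, each of the conditions above can be verified by exhaustive search. The following sketch does so for arithmetic modulo n, which is our own choice of illustration and is not among the number systems constructed in this chapter; the function names are likewise ours.

```python
from itertools import product

def is_abelian_group(elems, op, e):
    """Check closure plus conditions 1-3 of an Abelian group by brute force."""
    closed = all(op(x, y) in elems for x, y in product(elems, repeat=2))
    assoc = all(op(op(x, y), z) == op(x, op(y, z))
                for x, y, z in product(elems, repeat=3))
    comm = all(op(x, y) == op(y, x) for x, y in product(elems, repeat=2))
    ident = all(op(x, e) == x for x in elems)
    inverses = all(any(op(x, y) == e for y in elems) for x in elems)
    return closed and assoc and comm and ident and inverses

def classify_mod(n):
    """Test the ring, integral domain, and field conditions for arithmetic mod n."""
    D = set(range(n))
    add = lambda x, y: (x + y) % n
    mul = lambda x, y: (x * y) % n
    one = 1 % n
    ring = (
        is_abelian_group(D, add, 0)
        and all(mul(mul(x, y), z) == mul(x, mul(y, z))
                for x, y, z in product(D, repeat=3))
        and all(mul(x, y) == mul(y, x) for x, y in product(D, repeat=2))
        and all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
                for x, y, z in product(D, repeat=3))
        and all(mul(x, one) == x for x in D)
        and one != 0
    )
    nonzero = D - {0}
    domain = ring and all(mul(x, y) != 0 for x, y in product(nonzero, repeat=2))
    field = ring and all(any(mul(x, y) == one for y in D) for x in nonzero)
    return {"ring": ring, "integral_domain": domain, "field": field}

mod5 = classify_mod(5)
mod6 = classify_mod(6)
```

Arithmetic modulo 5 meets all the conditions, while modulo 6 the product 2 · 3 = 0 exhibits zero divisors, so condition 4 (and hence 4′) fails.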


* It is also possible to define a group to be a pair ⟨A, +⟩, since the zero element
turns out to be uniquely determined. We have formulated these definitions to match the
exposition in this chapter.

An ordered field is a sextuple ⟨D, +, ·, 0, 1, <⟩ such that the following
conditions are met:

1. ⟨D, +, ·, 0, 1⟩ is a field.
2. < is a linear ordering on D.
3. Order is preserved by addition and by multiplication by a positive
element:

x < y ⇔ x + z < y + z.

If 0 < z, then

x < y ⇔ x · z < y · z.


We can define ordered integral domain or even ordered commutative ring 
with identity by adjusting the first condition. A complete ordered field is an 
ordered field in which for every bounded nonempty subset of D there is a 
least upper bound. 

The constructions in this chapter can be viewed as providing an existence 
proof for such fields. The conditions for a complete ordered field are not 
impossible to meet, for we have constructed a field meeting them. 


TWO 


What is a two? What are numbers? These are awkward questions; 
yet when we discuss numbers one might naively expect us to know what it 
is we are talking about. 

In the Real World, we do not encounter (directly) abstract objects such 
as numbers. Instead we find physical objects: a number of similar apples, 
a ruler, partially filled containers (Fig. 26). 


Fig. 26. A picture of the Real World. 


Somehow we manage to abstract from this physical environment the 
concept of numbers. Not in any precise sense, of course, but we feel inwardly 
that we know what numbers are, or at least some numbers like 2 and 3. 
And we have various mental images that we use when thinking about 
numbers (Fig. 27).


Fig. 27. Pictures of mental images. 


Then at some point we acquire a mathematical education. This teaches 
us many formal manipulations of symbols (Fig. 28). It all seems fairly 
reasonable, but it can all be done without paying very much attention to 
what the symbols stand for. In fact computers can be programmed to 
carry out the manipulations without any understanding at all. 

Since numbers are abstract objects (as contrasted with physical objects) 
it might be helpful to consider first other abstractions. Take honesty, for 
example. Honesty is a property possessed by those people whose utterances 
are true sentences, who do not fudge on their income tax, and so forth. By 
way of imitation, we can try characterizing two as the property that is true 
of exactly those sets that, for some distinct x and y, have as members 
x and y and nothing else. A slight variation on this proposal would be to 
eliminate sets and characterize two as the property that is true of exactly 
those properties that, for some distinct x and y, are true of x and y and 
nothing else. (This proposal is due to Frege.) 




Fig. 28. Pictures of manipulated symbols.

If we define numbers in terms of properties, someone might ask us what
a property is. And no matter how we define numbers, the procedure of 
defining one concept in terms of another cannot go on forever, producing 
an infinite regress of definitions. Eventually the procedure must be founded 
on a commonly agreed upon basis. Properties might form such a basis. 
For a mathematician, sets form a more workable basis. 

Actually there is a close connection between properties and sets. Define 
the extension of a property to be the set of all objects of which that 
property is true. If a couple of properties have the same extension (i.e., if they 
are true of exactly the same objects), are they in fact one and the same 
property? Is the property of being a prime number less than 10 the same 
as the property of being a solution to 


x⁴ − 17x³ + 101x² − 247x + 210 = 0?


If you answer “yes,” then one says that you are thinking of properties 
extensionally, whereas if you answer “no,” then you are thinking of properties 
intensionally. Both alternatives are legitimate, as long as the choice of 
alternatives is made clear. Either way, sets are the extensions of properties. 
(All sets are obtainable in this way; the set x is the extension of the 
property of belonging to x.) 
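That the two properties in this example really do have the same extension is easy to confirm by direct computation. In the sketch below (names ours), the search range for integer roots is an arbitrary choice; since a polynomial of degree four has at most four roots, finding four of them settles the matter.

```python
def p(x):
    return x**4 - 17 * x**3 + 101 * x**2 - 247 * x + 210

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, n))

# Extension of "being a prime number less than 10"
primes_below_10 = {n for n in range(10) if is_prime(n)}
# Extension of "being a solution of the equation", searched among integers
integer_roots = {x for x in range(-300, 301) if p(x) == 0}
```

Both sets come out as {2, 3, 5, 7}: one set, described by two properties that differ intensionally.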

We can recast Frege’s proposal in terms of sets as follows: Two is the 
set having as members exactly those sets that, for some distinct x and y, 
have as members x and y and nothing else. But in Zermelo-Fraenkel
set theory, there is no such set. (For the number one, you should check
Exercise 8 of Chapter 2.) Our response to this predicament was to select
artificially one particular set {∅, {∅}} as a paradigm. Now {∅, {∅}} is very
different from the property that is true of exactly those sets that, for some 
distinct x and y, have as members x and y and nothing else, but it serves 
as an adequate substitute. Then rapidly one thing led to another, until 
we had the complete ordered field of real numbers. And as we have mentioned, 
one complete ordered field is very much like any other. 

But let us back up a little. In mathematics there are two ways to introduce 
new objects: 


(i) The new objects might be defined in terms of other already known 
objects. 

(ii) The new objects can be introduced as primitive notions and axioms 
can be adopted to describe the notions. (This is not so much a way of 
answering foundational questions about the objects as it is a way of 
circumventing them.) 


In constructing real numbers as certain sets we have selected the first 
path. The axiomatic approach would regard the definition of a complete
ordered field as axioms concerning the real number system (as a primitive
notion). On the other hand, for sets themselves we have followed (with 
stripes) the axiomatic method. 

But what about the Real World and those mental images? And the 
manipulated symbols? We want not just any old concept of number, 
but a concept that accurately reflects our experiences with apples and 
rulers and containers, and accurately mirrors our mental images. This is 
not a precise criterion, since it demands that a precise mathematical concept 
be compared with informal and intuitive ideas. And consequently the 
question whether our concept is indeed accurate must be evaluated on 
informal grounds. Throughout this chapter our formulation of definitions 
has been motivated ultimately by our intuitive ideas. Is there any way we 
could have gone wrong? 

Yes, we could have gone wrong. In seeking a number system applicable 
to problems dealing with physical objects and physical space, we might have 
been guided by erroneous ideas. There is always the possibility that lines in 
the Real World do not really resemble R. For example, over very short 
distances, space might be somehow quantized instead of being continuous. 
(Experimental evidence forced us to accept the fact that matter is quantized; 
experimental evidence has not yet forced us to accept similar ideas about 
space.) Or over very large distances, space might not be Euclidean (a 
possibility familiar to science-fiction buffs). In such events, mathematical 
theorems about R, while still true of R, would be less interesting, as they 
might be inapplicable to certain problems in the Real World. 

Mathematical concepts are useful in solving problems from the Real 
World to the extent that the concepts accurately reflect the essential features 
of those problems. The process of solving a problem mathematically has 
three parts (Fig. 29). We begin with a Real World problem. Then we need 
to model the original problem by a mathematical problem. This typically 
requires simplifying or idealizing some aspects of the original problem. 
(For example, we might decide to ignore air resistance or friction.) The 
middle step in the process consists of finding a mathematical solution to 
the mathematical problem. The final step is to interpret the mathematical 
solution in terms of the original problem. The middle step in this process is 
called “pure mathematics,” and the entire process is called “applied 
mathematics.” 

We have, for example, all been given problems such as: If Johnny has 
six pennies and steals eight more, how many does he have? We first convert 
this to the mathematical problem: 6 + 8 = ? Then by pure mathematics 
(addition, in this case) we obtain 14 as the solution. Finally, we decide 
that Johnny has fourteen pennies. 

Fig. 29. Applied mathematics. (Diagram: abstraction leads from the Real
World problem to the mathematical problem, pure mathematics leads from
the mathematical problem to the mathematical solution, and interpretation
leads from the mathematical solution back to the Real World problem.)

The mathematical modeling of a Real World problem is not always this
straightforward. When we try to interpret our mathematical solution in
terms of the original problem, we might discover that it just does not fit. 
If we start with six blobs of water and add eight more blobs, we may 
end up with only four or five rather large puddles. This outcome does 
not shake our faith in arithmetic at all. It does show that we need to 
revise the model and try again (perhaps by measuring volume instead of 
counting blobs). From the vast array of mathematical concepts we must 
select those (if any!) that accurately model the essential feature of the 
problem to be solved. 


CHAPTER 6 


CARDINAL NUMBERS AND 
THE AXIOM OF CHOICE 


EQUINUMEROSITY 


We want to discuss the size of sets. Given two sets A and B, we want 
to consider such questions as: 


(a) Do A and B have the same size? 
(b) Does A have more elements than B? 


Now for finite sets, this is not very complicated. We can just count 
the elements in the two sets, and compare the resulting numbers. And if 
one of the sets is finite and the other is infinite, it seems conservative 
enough to say that the infinite set definitely has more elements than does 
the finite set. 

But now consider the case of two infinite sets. Our first need is for a 
definition: What exactly should “A has the same size as B” mean when 
A and B are infinite? After we select a reasonable definition, we can then 
ask, for example, whether any two infinite sets have the same size. (We have 
not yet officially defined “finite” or “infinite,” but we will soon be in a 
position to define these terms in a precise way.) 


An Analogy In order to find a solution to the above problem, we can 
first consider an analogous problem, but one on a very simple level.


Fig. 30. Are there exactly as many houses as people? 


Imagine that your mathematical education is just beginning—that you are 
on your way to nursery school. You are apprehensive about going, because 
you have heard that they have mathematics lessons and you cannot count 
past three. Sure enough, on the very first day they show you Fig. 30 and 
ask, “Are there exactly as many houses as people?” Your heart sinks. 
There are too many houses and too many people for you to count. This is 
just the predicament described earlier, where we had sets A and B that, 
being infinite, had too many elements to count. 

But wait! All is not lost. You take your crayon and start pairing people 
with houses (Fig. 31). You soon discover that there are indeed exactly as 
many houses as people. And you did not have to count past three. You 
get a gold star and go home happy. We adopt the same solution. 


Fig. 31. How to answer the question without counting. 


Definition A set A is equinumerous to a set B (written A ≈ B) iff there
is a one-to-one function from A onto B.


A one-to-one function from A onto B is called a one-to-one correspondence 
between A and B. For example, in Fig. 30 the set of houses is equinumerous 
to the set of people. A one-to-one correspondence between the sets is exhibited 
in Fig. 31. 


Example The set ω × ω is equinumerous to ω. There is a function J
mapping ω × ω one-to-one onto ω, shown in Fig. 32, where the value of
J(m, n) is written at the point with coordinates ⟨m, n⟩. In fact we can give
a polynomial expression for J:

J(m, n) = ½[(m + n)² + 3m + n],

as you are asked to verify in Exercise 2.

Fig. 32. ω × ω ≈ ω.
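Before attempting the verification asked for in Exercise 2, it can be reassuring to test the polynomial numerically. The sketch below computes J on every pair with m + n < N and checks that the values are exactly the first N(N + 1)/2 natural numbers, each taken once; the bound N = 30 is an arbitrary choice of ours, and a finite check is of course not a proof.

```python
def J(m, n):
    # (m + n)^2 + 3m + n = (m + n)(m + n + 1) + 2m is always even,
    # so the integer division below is exact.
    return ((m + n) ** 2 + 3 * m + n) // 2

N = 30
values = sorted(J(m, n) for m in range(N) for n in range(N) if m + n < N)
first_six = [J(0, 0), J(0, 1), J(1, 0), J(0, 2), J(1, 1), J(2, 0)]
```

The list first_six shows J counting along successive diagonals m + n = 0, 1, 2.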


Example The set of natural numbers is equinumerous to the set Q of
rational numbers, i.e., ω ≈ Q. The method to be used here is like the one used
in the preceding example. We arrange Q in an orderly pattern, then thread
a path through the pattern, pairing natural numbers with the rationals as we
go. The pattern is shown in Fig. 33. We define f: ω → Q, where f(n) is the
rational next to the bracketed numeral for n in Fig. 33. To ensure that f
is one-to-one, we skip rationals met for the second (or third or later) time.
Thus f(4) = −1, and we skip −2/2, −3/3, and so forth.

...  −2/1 [5]    −1/1 [4]    0/1 [0]    1/1 [1]     2/1 [10]    3/1 [11]  ...
...  −2/2        −1/2 [3]    0/2        1/2 [2]     2/2         3/2 [12]  ...
...  −2/3 [6]    −1/3 [7]    0/3        1/3 [8]     2/3 [9]     3/3       ...
...  −2/4        −1/4 [15]   0/4        1/4 [14]    2/4         3/4 [13]  ...

Fig. 33. ω ≈ Q.
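The skip-what-you-have-seen idea is easy to mechanize. The sketch below does not follow the spiral path of Fig. 33; as a simpler assumption of ours, it visits fractions in order of |numerator| + denominator, which reaches every rational just as well. The generator name is ours.

```python
from fractions import Fraction
from itertools import count, islice

def enumerate_rationals():
    """Yield every rational exactly once (a different path than Fig. 33)."""
    seen = set()
    for total in count(1):                 # total = |numerator| + denominator
        for den in range(1, total + 1):
            num = total - den
            for signed in ((num, -num) if num else (0,)):
                q = Fraction(signed, den)
                if q not in seen:          # skip rationals met a second time
                    seen.add(q)
                    yield q

first = list(islice(enumerate_rationals(), 200))
```

The position of a rational in this listing is the natural number paired with it, so the listing is a one-to-one correspondence between ω and Q.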


Example The open unit interval

(0, 1) = {x ∈ R | 0 < x < 1}

is equinumerous to the set R of all real numbers. A geometric construction
of a one-to-one correspondence is shown in Fig. 34. Here (0, 1) has been
bent into a semicircle with center P. Each point in (0, 1) is paired with its
projection (from P) on the real line.

There is also an analytical proof that (0, 1) ≈ R. Let f(x) =
tan π(2x − 1)/2. Then f maps (0, 1) one-to-one (and continuously) onto R.
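The analytical correspondence can be explored numerically. The sketch below uses floating-point approximations, so it illustrates the map rather than proving anything; the inverse formula f_inverse is our own addition, obtained by solving y = tan π(2x − 1)/2 for x.

```python
import math

def f(x):
    """Map a point of the open interval (0, 1) to a real number."""
    return math.tan(math.pi * (2 * x - 1) / 2)

def f_inverse(y):
    """Map a real number back into (0, 1)."""
    return math.atan(y) / math.pi + 0.5

midpoint = f(0.5)                      # the middle of the interval goes to 0
round_trip = f_inverse(f(0.3))         # should return approximately 0.3
far_left, far_right = f_inverse(-1e9), f_inverse(1e9)
```

Even reals of very large magnitude come back to points strictly between 0 and 1, in keeping with f being onto R.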


Fig. 34. (0, 1) ≈ R.

As the above example shows, it is quite possible for an infinite set, such
as R, to be equinumerous to a proper subset of itself, such as (0, 1).
(For finite sets this never happens, as we will prove shortly.) Galileo
remarked in 1638 that ω was equinumerous to the set {0, 1, 4, 9, ...} of
squares of natural numbers, and found this to be a curious fact. The
squares are in some sense a small part of the natural numbers, e.g., the
fraction of the natural numbers less than n that are squares converges to 
0 as n tends to infinity. But when viewed simply as two abstract sets, the 
set of natural numbers and the set of squares have the same size. Similarly 
the set of even integers is equinumerous to the set of all integers. If we 
focus attention on the way in which even integers are placed among the 
others, then we are tempted to say that there are only half as many even 
integers as there are integers altogether. But if we instead view the two sets 
as two different abstract sets, then they have the same size. 


Example  For any set $A$, we have $\mathcal{P}A \approx {}^{A}2$. To prove this, we define
a one-to-one function $H$ from $\mathcal{P}A$ onto ${}^{A}2$ as follows: For any subset $B$
of $A$, $H(B)$ is the characteristic function of $B$, i.e., the function $f_B$ from $A$
into 2 for which

\[ f_B(x) = \begin{cases} 1 & \text{if } x \in B, \\ 0 & \text{if } x \in A - B. \end{cases} \]

Then any function $g \in {}^{A}2$ is in ran $H$, since

\[ g = H(\{x \in A \mid g(x) = 1\}). \]
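
For a small finite set the correspondence between subsets and characteristic functions can be checked exhaustively. This is only an illustrative sketch: functions from $A$ into 2 are modeled as Python dicts, and the helper names `powerset`, `H_inverse`, and `all_functions` are assumptions made for the example.

```python
from itertools import chain, combinations, product

def powerset(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def H(B, A):
    """Characteristic function of B (a subset of A), as a dict from A into {0, 1}."""
    return {x: (1 if x in B else 0) for x in A}

def H_inverse(g):
    """Recover the subset {x in A | g(x) = 1} from a function g: A -> 2."""
    return frozenset(x for x, value in g.items() if value == 1)

A = {"a", "b", "c"}
subsets = powerset(A)
# Every function from A into 2 = {0, 1}, listed independently of H.
all_functions = [dict(zip(sorted(A), bits))
                 for bits in product((0, 1), repeat=len(A))]
```

Here `H_inverse(H(B, A))` returns `B` for each subset, and every listed function is `H` of the subset it determines, which is the finite shadow of the argument above.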


The following theorem shows that equinumerosity has the property of
being reflexive (on the class of all sets), symmetric, and transitive. But it
cannot be represented by an equivalence relation, because it concerns
all sets. In von Neumann-Bernays set theory, one can form the class

\[ E = \{\langle A, B\rangle \mid A \approx B\}. \]

Then $E$ is an “equivalence relation on $V$,” in the sense that it is a class of
ordered pairs that is reflexive on $V$, symmetric, and transitive. But $E$ is not
a set, lest its field $V$ be a set. In Zermelo-Fraenkel set theory, we have
only the “equivalence concept” of equinumerosity.


Theorem 6A  For any sets $A$, $B$, and $C$:
(a) $A \approx A$.
(b) If $A \approx B$, then $B \approx A$.
(c) If $A \approx B$ and $B \approx C$, then $A \approx C$.


Proof  See Exercise 5. ⊣


In light of the examples presented up to now, you might well ask 
whether any two infinite sets are equinumerous. Such is not the case; some 
infinite sets are much larger than others. 


Theorem 6B (Cantor 1873)  (a) The set $\omega$ is not equinumerous to
the set $\mathbb{R}$ of real numbers.
(b) No set is equinumerous to its power set.


Proof  We will show that for any function $f\colon \omega \to \mathbb{R}$, there is a real
number $z$ not belonging to ran $f$. Imagine a list of the successive values
of $f$, expressed as infinite decimals:

    f(0) = 236.001...,
    f(1) = -7.7717...,
    f(2) = 3.1415...

(In Chapter 5 we did not go into the matter of decimal expansions, but
you are surely familiar with them.) We will proceed to construct the real $z$.
The integer part is 0, and the $(n + 1)$st decimal place of $z$ is 7 unless the
$(n + 1)$st decimal place of $f(n)$ is 7, in which case the $(n + 1)$st decimal
place of $z$ is 6. For example, in the case shown,

\[ z = 0.767\ldots. \]

Then $z$ is a real number not in ran $f$, as it differs from $f(n)$ in the
$(n + 1)$st decimal place. (Because every digit of $z$ is 6 or 7, $z$ has only one
decimal expansion, so differing in a decimal place really does give $z \neq f(n)$.)
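
The digit-by-digit rule for $z$ is mechanical, so it can be sketched in a few lines. This is an illustration on a finite prefix only, assuming each value of $f$ is supplied as a string holding its decimal part; the function name `diagonal_digits` is hypothetical.

```python
def diagonal_digits(decimal_parts):
    """Given the decimal parts of f(0), f(1), ... (strings of digits),
    return the first len(decimal_parts) decimal digits of z."""
    digits = []
    for n, expansion in enumerate(decimal_parts):
        # The (n + 1)st decimal place of f(n) is expansion[n] (0-based index n).
        digits.append("6" if expansion[n] == "7" else "7")
    return "".join(digits)

# Decimal parts of the three values listed in the proof
# (236.001..., -7.7717..., 3.1415...).
listed = ["001", "7717", "1415"]
z_digits = diagonal_digits(listed)
```

For the three listed values this yields the digits 7, 6, 7, in agreement with $z = 0.767\ldots$ above.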


The proof of (b) is similar. Let $g\colon A \to \mathcal{P}A$; we will construct a subset
$B$ of $A$ that is not in ran $g$. Specifically, let

\[ B = \{x \in A \mid x \notin g(x)\}. \]

Then $B \subseteq A$, but for each $x \in A$,

\[ x \in B \iff x \notin g(x). \]

Hence $B \neq g(x)$. ⊣
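
For a three-element set one can simply try every function $g$ from $A$ into $\mathcal{P}A$ and confirm that the set $B$ is never a value of $g$. The sketch below does this by brute force; the names `powerset`, `diagonal_set`, `all_g`, and `missed` are chosen for the example and are not part of the text.

```python
from itertools import chain, combinations, product

def powerset(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def diagonal_set(A, g):
    """B = {x in A | x not in g(x)} for a function g: A -> P(A) given as a dict."""
    return frozenset(x for x in A if x not in g[x])

A = [0, 1, 2]
subsets = powerset(A)
# Every function g: A -> P(A); there are 8**3 = 512 of them.
all_g = [dict(zip(A, values)) for values in product(subsets, repeat=len(A))]
# For each g, record whether the diagonal set B is missing from ran g.
missed = [diagonal_set(A, g) not in g.values() for g in all_g]
```

Every entry of `missed` is `True`, which is what the proof predicts: no $g$ has $B$ in its range.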

The set $\mathbb{R}$ happens to be equinumerous to $\mathcal{P}\omega$, as we will soon be able
to prove. A larger set is $\mathcal{P}\mathbb{R}$, and $\mathcal{P}\mathcal{P}\mathbb{R}$ is larger still.

Before continuing our consideration of infinite sets, we will study the
other alternative: the sets that are “small” at least to the extent of being
finite.


Exercises 
1. Show that the equation

\[ f(m, n) = 2^{m}(2n + 1) - 1 \]

defines a one-to-one correspondence between $\omega \times \omega$ and $\omega$.

2. Show that in Fig. 32 we have:

\[ J(m, n) = [1 + 2 + \cdots + (m + n)] + m = \tfrac{1}{2}[(m + n)^{2} + 3m + n]. \]

3. Find a one-to-one correspondence between the open unit interval $(0, 1)$
and $\mathbb{R}$ that takes rationals to rationals and irrationals to irrationals.

4. Construct a one-to-one correspondence between the closed unit interval

\[ [0, 1] = \{x \in \mathbb{R} \mid 0 \le x \le 1\} \]

and the open unit interval $(0, 1)$.

5. Prove Theorem 6A.


FINITE SETS 


Although we have long been using the words “finite” and “infinite” in 
an informal way, we have not yet given them precise definitions. Now is 
the time. 


Definition A set is finite iff it is equinumerous to some natural number. 
Otherwise it is infinite. 


Here we rely on the fact that in our construction of $\omega$, each natural
number is the set of all smaller natural numbers. For example, any natural
number is itself a finite set.

We want to check that each finite set $S$ is equinumerous to a unique
number $n$. The number $n$ can then be used as a count of the elements in $S$.
We first need the following theorem, which implies that if $n$ objects are
placed into fewer than $n$ pigeonholes, then some pigeonhole receives more
than one object. Recall that a set $A$ is a proper subset of $B$ if $A \subseteq B$
and $A \neq B$.


Pigeonhole Principle No natural number is equinumerous to a proper 
subset of itself. 


Proof Assume that f is a one-to-one function from the set n into the 
set n. We will show that ran f is all of the set n (and not a proper subset 
of n). This suffices to prove the theorem. 


Fig. 35. In $f$ we interchange two values to obtain $\hat{f}$.


We use induction on $n$. Define:

\[ T = \{n \in \omega \mid \text{any one-to-one function from } n \text{ into } n \text{ has range } n\}. \]

Then $0 \in T$; the only function from the set 0 into the set 0 is $\emptyset$ and its
range is the set 0. Suppose that $k \in T$ and that $f$ is a one-to-one function
from the set $k^{+}$ into the set $k^{+}$. We must show that the range of $f$ is all
of the set $k^{+}$; this will imply that $k^{+} \in T$. Note that the restriction
$f \restriction k$ of $f$ to the set $k$ maps the set $k$ one-to-one into the set $k^{+}$.

Case I  Possibly the set $k$ is closed under $f$. Then $f \restriction k$ maps the set
$k$ into the set $k$. Then because $k \in T$ we may conclude that ran $(f \restriction k)$
is all of the set $k$. Since $f$ is one-to-one, the only possible value for $f(k)$
is the number $k$. Hence ran $f$ is $k \cup \{k\}$, which is the set $k^{+}$.

Case II  Otherwise $f(p) = k$ for some number $p$ less than $k$. In this
case we interchange two values of the function. Define $\hat{f}$ by

\[ \hat{f}(p) = f(k), \]
\[ \hat{f}(k) = f(p) = k, \]
\[ \hat{f}(x) = f(x) \quad \text{for other } x \in k^{+} \]

(see Fig. 35). Then $\hat{f}$ maps the set $k^{+}$ one-to-one into the set $k^{+}$, and
the set $k$ is closed under $\hat{f}$. So by Case I, ran $\hat{f} = k^{+}$. But ran $\hat{f}$ = ran $f$.
Thus in either case, ran $f = k^{+}$. So $T$ is inductive and equals $\omega$. ⊣
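
Both the statement and the Case II construction can be exercised on small natural numbers. The sketch below is a brute-force check rather than a proof; it models a function from $n$ into $n$ as a tuple of values, and the helper names (`is_one_to_one`, `swap_values`, `all_functions`) are assumptions for the example.

```python
from itertools import product

def is_one_to_one(f):
    """A tuple of values is one-to-one when no value repeats."""
    return len(set(f)) == len(f)

def swap_values(f):
    """Case II of the proof: f is a one-to-one tuple on k+ = {0, ..., k}
    with f[p] == k for some p < k. Return f-hat, which interchanges the
    values at p and k so that the set k is closed under f-hat."""
    k = len(f) - 1
    p = f.index(k)
    f_hat = list(f)
    f_hat[p], f_hat[k] = f[k], f[p]
    return tuple(f_hat)

def all_functions(n):
    """Every function from n = {0, ..., n-1} into n, as a tuple of values."""
    return product(range(n), repeat=n)

# Brute-force check of the pigeonhole principle for n = 0, ..., 5:
# every one-to-one function from n into n has range n.
checked = all(
    set(f) == set(range(n))
    for n in range(6)
    for f in all_functions(n)
    if is_one_to_one(f)
)
```

For instance, with $k = 3$ the function with values (2, 0, 3, 1) has $f(2) = 3$, and `swap_values` returns (2, 0, 1, 3), which has the same range and leaves the set 3 closed.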




Corollary 6C No finite set is equinumerous to a proper subset of itself. 


Proof This is the same as the pigeonhole principle, but for an arbitrary 
finite set A instead of a natural number. Since A is equinumerous to a 
natural number n, we can use the one-to-one correspondence g between A 
and n to “transfer” the pigeonhole principle to the set A. 

Suppose that, contrary to our hopes, there is a one-to-one correspondence
$f$ between $A$ and some proper subset of $A$. Consider the composition
$g \circ f \circ g^{-1}$, illustrated in Fig. 36. This composition maps $n$ into $n$, and it is
one-to-one by Exercise 17 of Chapter 3. Furthermore its range $C$ is a
proper subset of $n$. (Consider any $a$ in $A - \text{ran } f$; then $g(a) \in n - C$.) Thus
$n$ is equinumerous to $C$, in contradiction to the pigeonhole principle. ⊣


The foregoing proof uses an argument that is useful elsewhere as well.
We have sets $A$ and $n$ that are “alike” in that $A \approx n$, but different in
that they have different members. Think of the members of $A$ as being red,
the members of $n$ as being blue. Then the function $g\colon A \to n$ paints red
things blue, and the function $g^{-1}\colon n \to A$ paints blue things red. The
composition $g \circ f \circ g^{-1}$ paints things red, applies $f$, and then restores the
blue color.

Fig. 36. What $f$ does to $A$, $g \circ f \circ g^{-1}$ does to $n$.


Corollary 6D  (a) Any set equinumerous to a proper subset of itself is
infinite.
(b) The set $\omega$ is infinite.


Proof  The preceding corollary proves part (a). Part (b) follows at
once from part (a), since the function $\sigma$ whose value at each number $n$ is $n^{+}$
maps $\omega$ one-to-one onto $\omega - \{0\}$. ⊣


Corollary 6E Any finite set is equinumerous to a unique natural 
number. 


Proof  Assume that $A \approx m$ and $A \approx n$ for natural numbers $m$ and $n$.
Then $m \approx n$. By trichotomy and Corollary 4M, either $m = n$ or one is a proper
subset of the other. But the latter alternative is impossible since $m \approx n$.
Hence $m = n$. ⊣


For a finite set $A$, the unique $n \in \omega$ such that $A \approx n$ is called the
cardinal number of $A$, abbreviated card $A$. For example, card $n = n$ for
$n \in \omega$. And if $a$, $b$, $c$, and $d$ are all distinct objects, then card $\{a, b, c, d\} = 4$.
This is because $\{a, b, c, d\} \approx 4$; selecting a one-to-one correspondence is the
process called “counting.” Observe that for any finite sets $A$ and $B$, we have
$A \approx \text{card } A$ and

\[ \operatorname{card} A = \operatorname{card} B \quad \text{iff} \quad A \approx B. \]


What about infinite sets? The number card A measures the size of a 
finite set A. We want “numbers” that similarly measure the size of infinite 
sets. Just what sets these “numbers” are is not too crucial, any more 
than it was crucial just what set the number 2 was. The essential demand 
is that we define card A for arbitrary A in such a way that 


\[ \operatorname{card} A = \operatorname{card} B \quad \text{iff} \quad A \approx B. \]


Now it turns out that there is no way of defining card A that is really 
simple. We therefore postpone until Chapter 7 the actual definition of the 
set card A. The information we need for the present chapter is embodied 
in the following promise. 


Promise  For any set $A$ we will define a set card $A$ in such a way that:
(a) For any sets $A$ and $B$,

\[ \operatorname{card} A = \operatorname{card} B \quad \text{iff} \quad A \approx B. \]

(b) For a finite set $A$, card $A$ is the natural number $n$ for which $A \approx n$.

(In making good on this promise, we will use in Chapter 7 additional
axioms, namely the replacement axioms and the axiom of choice. If you
plan to omit Chapter 7, then regard card $A$ as an additional primitive
notion and the promise as an additional axiom.)


We define a cardinal number to be something that is card $A$ for some
set $A$. By part (b) of the promise, any natural number $n$ is also a cardinal
number, since $n = \text{card } n$. But card $\omega$ is not a natural number (card $\omega \neq n =$
card $n$, since $\omega$ is not equinumerous to $n$). Just what set card $\omega$ is will not
be revealed until Chapter 7. Meanwhile we give it the name that Cantor
gave it:

\[ \operatorname{card} \omega = \aleph_0. \]

The symbol $\aleph$ is aleph, the first letter of the Hebrew alphabet.

In general, for a cardinal number $\kappa$, there will be a great many sets $A$
of cardinality $\kappa$, i.e., sets with card $A = \kappa$. (The one exception to this occurs
when $\kappa = 0$.) In fact, for any nonzero cardinal $\kappa$, the class

\[ K_{\kappa} = \{X \mid \operatorname{card} X = \kappa\} \]

of sets of cardinality $\kappa$ is too large to be a set (Exercise 6). But all of the
sets of cardinality $\kappa$ look, from a great distance, very much alike: the
elements of two such sets may differ but the number of elements is always $\kappa$.
In particular, if one set $X$ of cardinality $\kappa$ is finite, then all of them are;
in this case $\kappa$ is a finite cardinal. And if not, then $\kappa$ is an infinite cardinal.
Thus the finite cardinals are exactly the natural numbers. $\aleph_0$ is an infinite
cardinal, as are card $\mathbb{R}$, card $\mathcal{P}\omega$, card $\mathcal{P}\mathcal{P}\omega$, etc.

Before leaving this section on finite sets, we will verify a fact that, on an 
informal level, appears inevitable: Any subset of a finite set is finite. 
(Later we will find another proof of this.) 


Lemma 6F  If $C$ is a proper subset of a natural number $n$, then $C \approx m$
for some $m$ less than $n$.


Proof  We use induction. Let

\[ T = \{n \in \omega \mid \text{any proper subset of } n \text{ is equinumerous to a member of } n\}. \]

Then $0 \in T$ vacuously, 0 having no proper subsets. Assume that $k \in T$ and
consider a proper subset $C$ of $k^{+}$.

Case I  $C = k$. Then $C \approx k \in k^{+}$.

Case II  $C$ is a proper subset of $k$. Then since $k \in T$, we have $C \approx m$
for some $m \in k \in k^{+}$.

Case III  Otherwise $k \in C$. Then $C = (C \cap k) \cup \{k\}$ and $C \cap k$ is a
proper subset of $k$. Because $k \in T$, there is $m \in k$ with $C \cap k \approx m$. Say $f$ is a
one-to-one correspondence between $C \cap k$ and $m$; then $f \cup \{\langle k, m\rangle\}$ is a
one-to-one correspondence between $C$ and $m^{+}$. Since $m \in k$, we have
$m^{+} \in k^{+}$. Hence $C \approx m^{+} \in k^{+}$ and $k^{+} \in T$.

Thus $T$ is inductive and coincides with $\omega$. ⊣


Corollary 6G  Any subset of a finite set is finite.


Proof  Consider $A \subseteq B$ and let $f$ be a one-to-one correspondence
between $B$ and some $n$ in $\omega$. Then $A \approx f[A] \subseteq n$. Either $f[A] = n$, or by the
lemma $f[A] \approx m$ for some $m \in n$. In both cases $A$ is equinumerous to a
natural number (namely $n$ or some $m \in n$), so $A$ is finite. ⊣


Exercises 


6. Let $\kappa$ be a nonzero cardinal number. Show that there does not exist
a set to which every set of cardinality $\kappa$ belongs.

7. Assume that $A$ is finite and $f\colon A \to A$. Show that $f$ is one-to-one iff
ran $f = A$.

8. Prove that the union of two finite sets is finite (Corollary 6K), without
any use of arithmetic.

9. Prove that the Cartesian product of two finite sets is finite (Corollary
6K), without any use of arithmetic.


CARDINAL ARITHMETIC 


The operations of addition, multiplication, and exponentiation are well 
known to be useful for finite cardinals. The operations can be useful for 
arbitrary cardinals as well. To extend the concept of addition from the 
finite to the infinite case, we need a characterization of addition that is 
correct in the finite case, and is meaningful (and plausible) in the infinite 
case. In Chapter 4 we obtained addition on $\omega$ by use of the recursion
theorem. That approach is unsuitable here, so we seek another approach. 

The answer to our search lies in the way addition is actually explained 
in the elementary schools. First-graders are not told about the recursion 
theorem. Instead, if they want to add 2 and 3, they select two sets $K$ and
$L$ with card $K = 2$ and card $L = 3$. Sets of fingers are handy; sets of apples
are preferred by textbooks. Then they look at card$(K \cup L)$. If they had the
good sense to select $K$ and $L$ to be disjoint, then card$(K \cup L) = 5$.

The same idea is embodied in the following definition of addition. In
the same vein, we can include multiplication and exponentiation.


Definition  Let $\kappa$ and $\lambda$ be any cardinal numbers.

(a) $\kappa + \lambda = \operatorname{card}(K \cup L)$, where $K$ and $L$ are any disjoint sets of
cardinality $\kappa$ and $\lambda$, respectively.

(b) $\kappa \cdot \lambda = \operatorname{card}(K \times L)$, where $K$ and $L$ are any sets of cardinality $\kappa$
and $\lambda$, respectively.

(c) $\kappa^{\lambda} = \operatorname{card} {}^{L}K$, where $K$ and $L$ are any sets of cardinality $\kappa$ and $\lambda$,
respectively.

In every case it is necessary to prove that the operation is well defined; 
that is, that the outcome is independent of the particular sets K and L 
selected to represent the cardinals. Also for the case of finite cardinals, we 
should check that the above definitions are not in open conflict with 
operations on $\omega$ defined in Chapter 4.

First consider addition. To add two cardinals $\kappa$ and $\lambda$, the definition
demands that we first select disjoint sets $K$ and $L$ with card $K = \kappa$ and
card $L = \lambda$. This is possible; if our first choices for $K$ and $L$ fail to be
disjoint, we can switch to $K \times \{0\}$ and $L \times \{1\}$. For then, since

\[ K \approx K \times \{0\} \quad \text{and} \quad L \approx L \times \{1\}, \]

we have card$(K \times \{0\}) = \kappa$ and card$(L \times \{1\}) = \lambda$. And the sets $K \times \{0\}$ and
$L \times \{1\}$ are disjoint.
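
The switch to $K \times \{0\}$ and $L \times \{1\}$ is easy to see on finite sets. The following sketch is only illustrative; the function names `disjoint_copies` and `cardinal_sum` and the finger-themed sample sets are made up for the example.

```python
def disjoint_copies(K, L):
    """Replace K and L by the disjoint sets K x {0} and L x {1}."""
    return {(x, 0) for x in K}, {(y, 1) for y in L}

def cardinal_sum(K, L):
    """card K + card L for finite sets, computed as the size of a disjoint union."""
    K0, L1 = disjoint_copies(K, L)
    return len(K0 | L1)

K = {"thumb", "index"}
L = {"index", "middle", "ring"}   # overlaps K, so len(K | L) undercounts
```

Here the plain union `K | L` has only four members because the two sets overlap, whereas `cardinal_sum(K, L)` counts the tagged copies and gives 2 + 3 = 5.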

Having selected disjoint representatives $K$ and $L$ for $\kappa$ and $\lambda$, we form
their union $K \cup L$. Then by definition

\[ \kappa + \lambda = \operatorname{card}(K \cup L). \]

We must verify that this sum is independent of the particular disjoint sets
$K$ and $L$ selected. This verification is accomplished by part (a) of the
following theorem. For suppose $K_1$, $L_1$ and $K_2$, $L_2$ are two selections of
disjoint sets of cardinality $\kappa$, $\lambda$. Since card $K_1$ = card $K_2 = \kappa$, we have
$K_1 \approx K_2$; similarly $L_1 \approx L_2$. The following theorem then yields
$K_1 \cup L_1 \approx K_2 \cup L_2$, whence

\[ \operatorname{card}(K_1 \cup L_1) = \operatorname{card}(K_2 \cup L_2). \]

Thus we have an unambiguous sum $\kappa + \lambda$. Parts (b) and (c) of the theorem
perform the same service for multiplication and exponentiation.


Theorem 6H  Assume that $K_1 \approx K_2$ and $L_1 \approx L_2$.

(a) If $K_1 \cap L_1 = K_2 \cap L_2 = \emptyset$, then $K_1 \cup L_1 \approx K_2 \cup L_2$.
(b) $K_1 \times L_1 \approx K_2 \times L_2$.
(c) ${}^{L_1}K_1 \approx {}^{L_2}K_2$.


Proof  Since $K_1 \approx K_2$, there is a one-to-one function $f$ from $K_1$ onto
$K_2$; since $L_1 \approx L_2$, there is a one-to-one function $g$ from $L_1$ onto $L_2$. Then
for (a), the function $h$, defined by

\[ h(x) = \begin{cases} f(x) & \text{if } x \in K_1, \\ g(x) & \text{if } x \in L_1, \end{cases} \]

maps $K_1 \cup L_1$ one-to-one onto $K_2 \cup L_2$. (We need $K_1 \cap L_1 = \emptyset$ to be
sure $h$ is a function; we need $K_2 \cap L_2 = \emptyset$ to be sure it is one-to-one.)
For (b), the function $h$, defined by

\[ h(\langle x, y\rangle) = \langle f(x), g(y)\rangle \]

(for $x \in K_1$ and $y \in L_1$), maps $K_1 \times L_1$ one-to-one onto $K_2 \times L_2$.

Fig. 37. $H(j) = f \circ j \circ g^{-1}$.

Finally for (c), the function $H$, defined by

\[ H(j) = f \circ j \circ g^{-1}, \]

maps ${}^{L_1}K_1$ one-to-one onto ${}^{L_2}K_2$ (see Fig. 37). For clearly $H(j)$ is a
function from $L_2$ to $K_2$. To see that $H$ is one-to-one, suppose that $j$ and $j'$
are different functions from $L_1$ into $K_1$. Then $j(t) \neq j'(t)$ for some $t \in L_1$.
Then compute $H(j)$ and $H(j')$ at $g(t)$:

\[ H(j)(g(t)) = f(j(t)) \neq f(j'(t)) = H(j')(g(t)) \]

since $f$ is one-to-one. Hence the function $H(j)$ is different from the function
$H(j')$, i.e., $H$ is one-to-one. And to see that ran $H$ is all of ${}^{L_2}K_2$,
consider any function $d$ from $L_2$ into $K_2$. Then $d = H(j)$, where
$j = f^{-1} \circ d \circ g$. ⊣
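
Part (c) can be checked exhaustively on small sets. In the sketch below functions are modeled as dicts, the sets $K_1$, $K_2$, $L_1$, $L_2$ are hypothetical three- and two-element examples, and the helper `functions` is introduced only for this illustration.

```python
from itertools import product

def functions(domain, codomain):
    """All functions from domain into codomain, each as a dict."""
    domain = list(domain)
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

def H(j, f, g):
    """H(j) = f o j o g^{-1}, for bijections f: K1 -> K2 and g: L1 -> L2 (dicts)."""
    g_inverse = {v: k for k, v in g.items()}
    return {y: f[j[g_inverse[y]]] for y in g_inverse}

K1, K2 = [0, 1, 2], ["a", "b", "c"]
L1, L2 = [0, 1], ["x", "y"]
f = dict(zip(K1, K2))          # a one-to-one correspondence K1 -> K2
g = dict(zip(L1, L2))          # a one-to-one correspondence L1 -> L2

# Apply H to each of the 3**2 = 9 functions from L1 into K1.
images = [H(j, f, g) for j in functions(L1, K1)]
```

The nine images are pairwise distinct and exhaust the functions from `L2` into `K2`, even though many of the individual functions `j` (the constant ones, for example) are not one-to-one.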


Remark The proof of part (c) is our first proof dealing with cardinal 
exponentiation. Observe that it involves constructing a function H whose 
arguments and values are themselves functions. In such cases it is imperative 
to use a rationally chosen notation, e.g., an uppercase letter for H and 
lowercase letters for the arguments of H, as in the expression “H(j).” Here
the function H must be one-to-one. But the function j is in general not
one-to-one, nor is H(j). It may help to think of H as a “super function” 
that assigns certain lower-level functions to other lower-level functions. The 
super function in this proof is one-to-one, but the lower-level functions are 
not necessarily one-to-one. 


Examples  1. $2 + 2 = 4$. Proving this amounts to finding disjoint sets
$K$ and $L$ with $K \approx 2$ and $L \approx 2$, and then checking that $K \cup L \approx 4$. (This is
not the same as the verification of $2 + 2 = 4$ in Chapter 4, because now
we are using the addition operation for cardinal numbers. But Theorem 6J
assures us that the answer is unchanged.)

2. For $m$ and $n$ in $\omega$, $m \cdot n = \operatorname{card}(m \times n)$ and $m^{n} = \operatorname{card} {}^{n}m$.

3. For any natural number $n$,

\[ n + \aleph_0 = \aleph_0 \quad \text{and} \quad n \cdot \aleph_0 = \aleph_0 \]

(unless $n = 0$). Also

\[ \aleph_0 + \aleph_0 = \aleph_0 \quad \text{and} \quad \aleph_0 \cdot \aleph_0 = \aleph_0. \]

The last equation follows from an earlier example showing that $\omega \times \omega \approx \omega$.
To prove that $2 + \aleph_0 = \aleph_0$ it suffices to show that $\{a, b\} \cup \omega \approx \omega$
(where $a$ and $b$ are not in $\omega$). The function

\[ f(a) = 0, \quad f(b) = 1, \quad f(n) = n^{++} \]

establishes this. The other equations are left for you to check. Observe that
cancellation laws fail for infinite cardinals, e.g., $2 + \aleph_0 = 3 + \aleph_0$, but $2 \neq 3$.
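
The function used for $2 + \aleph_0 = \aleph_0$ can be written down directly. This is a small sketch on an initial segment only, with the two extra objects $a$ and $b$ represented by the strings "a" and "b" (an assumption of the example); the name `shift` is hypothetical.

```python
def shift(x):
    """f(a) = 0, f(b) = 1, f(n) = n++ for natural numbers n."""
    if x == "a":
        return 0
    if x == "b":
        return 1
    return x + 2

# The two extra objects followed by the first hundred natural numbers.
domain_sample = ["a", "b"] + list(range(100))
images = [shift(x) for x in domain_sample]
```

The images of the two extra objects and of 0, 1, ..., 99 are exactly 0, 1, ..., 101 with no repeats, which is the pattern that continues through all of $\omega$.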
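
4. For any cardinal number $\kappa$,

\[ \kappa + 0 = \kappa, \quad \kappa \cdot 0 = 0, \quad \kappa \cdot 1 = \kappa. \]

5. Recall that ${}^{\emptyset}K = \{\emptyset\}$ for any set $K$ and that ${}^{K}\emptyset = \emptyset$ for nonempty
$K$. In terms of cardinal numbers, these facts become

\[ \kappa^{0} = 1 \quad \text{for any } \kappa, \]
\[ 0^{\kappa} = 0 \quad \text{for any nonzero } \kappa. \]

In particular, $0^{0} = 1$.

6. For any set $A$, the cardinality of its power set is $2^{\operatorname{card} A}$. This is
because by the definition of exponentiation,

\[ 2^{\operatorname{card} A} = \operatorname{card}({}^{A}2). \]

And we have shown that ${}^{A}2 \approx \mathcal{P}A$, and hence

\[ 2^{\operatorname{card} A} = \operatorname{card}({}^{A}2) = \operatorname{card} \mathcal{P}A. \]

In particular, card $\mathcal{P}\omega = 2^{\aleph_0}$. (The term “power set” is rooted in the fact
that card $\mathcal{P}A$ equals 2 raised to the power card $A$.)

7. By Cantor’s theorem and the preceding example, $\kappa \neq 2^{\kappa}$ for any $\kappa$.
In particular $\aleph_0 \neq 2^{\aleph_0}$.

8. For any cardinal $\kappa$,

\[ \kappa + \kappa = 2 \cdot \kappa. \]

You are invited to explain how this fact is obtained.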


The following theorem lists some of the elementary facts of cardinal 
arithmetic. (Parts of the theorem will later be seen to be rather trivial.) 


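Theorem 6I  For any cardinal numbers $\kappa$, $\lambda$, and $\mu$:

1. $\kappa + \lambda = \lambda + \kappa$ and $\kappa \cdot \lambda = \lambda \cdot \kappa$.
2. $\kappa + (\lambda + \mu) = (\kappa + \lambda) + \mu$ and $\kappa \cdot (\lambda \cdot \mu) = (\kappa \cdot \lambda) \cdot \mu$.
3. $\kappa \cdot (\lambda + \mu) = \kappa \cdot \lambda + \kappa \cdot \mu$.
4. $\kappa^{\lambda + \mu} = \kappa^{\lambda} \cdot \kappa^{\mu}$.
5. $(\kappa \cdot \lambda)^{\mu} = \kappa^{\mu} \cdot \lambda^{\mu}$.
6. $(\kappa^{\lambda})^{\mu} = \kappa^{\lambda \cdot \mu}$.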


Proof  Take sets $K$, $L$, and $M$ with card $K = \kappa$, card $L = \lambda$, and
card $M = \mu$; for convenience choose them in such a way that any two
are disjoint. Then each of the equations reduces to a corresponding statement
about equinumerous sets. For example, $\kappa \cdot \lambda = \operatorname{card}(K \times L)$ and
$\lambda \cdot \kappa = \operatorname{card}(L \times K)$; consequently showing that $\kappa \cdot \lambda = \lambda \cdot \kappa$ reduces to showing
that $K \times L \approx L \times K$. Listed in full, the statements to be verified are:

1. $K \cup L \approx L \cup K$ and $K \times L \approx L \times K$.
2. $K \cup (L \cup M) \approx (K \cup L) \cup M$ and $K \times (L \times M) \approx (K \times L) \times M$.
3. $K \times (L \cup M) \approx (K \times L) \cup (K \times M)$.
4. ${}^{(L \cup M)}K \approx {}^{L}K \times {}^{M}K$.
5. ${}^{M}(K \times L) \approx {}^{M}K \times {}^{M}L$.
6. ${}^{M}({}^{L}K) \approx {}^{(L \times M)}K$.

Most of the verifications are left as exercises.

In the case of item 6 we want a one-to-one function $H$ from ${}^{M}({}^{L}K)$
onto ${}^{(L \times M)}K$. For $f \in {}^{M}({}^{L}K)$, let $H(f)$ be the function whose value at $\langle l, m\rangle$
equals the value of the function $f(m)$ at $l$.

To see that $H$ is one-to-one, observe that if $f \neq g$ (both belonging to
${}^{M}({}^{L}K)$), then for some $m$, the functions $f(m)$ and $g(m)$ differ. This in turn
implies that for some $l$, $f(m)(l) \neq g(m)(l)$. Hence

\[ H(f)(l, m) = f(m)(l) \neq g(m)(l) = H(g)(l, m) \]

so that $H(f) \neq H(g)$.

Finally, to see that ran $H$ is all of ${}^{(L \times M)}K$, consider any function
$j \in {}^{(L \times M)}K$. Then $j = H(f)$, where (for $m \in M$) $f(m)$ is the function whose
value at $l \in L$ is $j(l, m)$. ⊣
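
The map $H$ for item 6 is what programmers call uncurrying, and its inverse is currying. The sketch below models functions as dicts; the sets and the sample function `f` are hypothetical, and `H_inverse` is a name introduced for the example.

```python
def H(f):
    """Uncurry: from f: M -> (L -> K) build the function on L x M
    whose value at (l, m) is f(m)(l). Functions are dicts."""
    return {(l, m): value
            for m, inner in f.items()
            for l, value in inner.items()}

def H_inverse(j):
    """Curry: from j on L x M recover f with f(m)(l) = j(l, m)."""
    f = {}
    for (l, m), value in j.items():
        f.setdefault(m, {})[l] = value
    return f

# Hypothetical small example: L = {0, 1}, M = {"p", "q"}, K = {"a", "b", "c"}.
f = {"p": {0: "a", 1: "b"}, "q": {0: "c", 1: "c"}}
j = H(f)
```

With these sizes there are $(3^2)^2 = 81$ possible functions `f` and $3^{2 \cdot 2} = 81$ possible functions `j`, matching $(\kappa^{\lambda})^{\mu} = \kappa^{\lambda \cdot \mu}$ in the finite case.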


The next theorem reassures us that for finite cardinals (i.e., for natural 
numbers), the present arithmetic operations agree with those of Chapter 4. 
In Chapter 4, exponentiation was defined only in passing; for completeness 
we include it here. 


Theorem 6J  Let $m$ and $n$ be finite cardinals. Then

\[ m + n = m +_{\omega} n, \]
\[ m \cdot n = m \cdot_{\omega} n, \]
\[ m^{n} = m^{n}, \]

where on the right side we use the operations of Chapter 4 (defined by
recursion) and on the left side we use the operations of cardinal
arithmetic.


Proof  We use induction on $n$. First we claim that for any cardinal
numbers $\kappa$ and $\lambda$ the following identities are correct.

(a1) $\kappa + 0 = \kappa$.
(a2) $\kappa + (\lambda + 1) = (\kappa + \lambda) + 1$.
(m1) $\kappa \cdot 0 = 0$.
(m2) $\kappa \cdot (\lambda + 1) = \kappa \cdot \lambda + \kappa$.
(e1) $\kappa^{0} = 1$.
(e2) $\kappa^{\lambda + 1} = \kappa^{\lambda} \cdot \kappa$.

In each case, the equation is either trivial or is an immediate consequence
of Theorem 6I (or both).

The second piece of information we need is that for a finite cardinal $n$,

\[ n + 1 = n^{+} \]

(with cardinal addition). This holds because $n$ and $\{n\}$ are disjoint sets of
cardinality $n$ and 1, respectively, and hence

\[ n + 1 = \operatorname{card}(n \cup \{n\}) = \operatorname{card} n^{+} = n^{+}. \]

It remains only to go through the motions of the induction. Consider
any $m \in \omega$ and let

\[ T = \{n \in \omega \mid m + n = m +_{\omega} n\}. \]

Then $0 \in T$ since $m + 0 = m = m +_{\omega} 0$, by (a1) and (A1), where (A1) is given
in Theorem 4I. Suppose that $k \in T$. Then

    $m + k^{+} = m + (k + 1)$
    $= (m + k) + 1$            by (a2)
    $= (m +_{\omega} k) + 1$   since $k \in T$
    $= (m +_{\omega} k)^{+}$
    $= m +_{\omega} k^{+}$     by (A2).

Hence $k^{+} \in T$, $T$ is inductive, and $T = \omega$.

For multiplication and exponentiation the argument is identical. ⊣


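Corollary 6K  If $A$ and $B$ are finite, then $A \cup B$, $A \times B$, and ${}^{B}A$ are
also finite.


Proof  Let $m$ = card $A$ and $n$ = card $B$. Then we calculate:
$\operatorname{card}(A \times B) = \operatorname{card} A \cdot \operatorname{card} B = m \cdot n = m \cdot_{\omega} n \in \omega$.
A similar argument applies to ${}^{B}A$ and $m^{n}$.

For union we must use disjoint sets:

\[ A \cup B = A \cup (B - A). \]

$B - A$ is a subset of a finite set and hence (by Corollary 6G) is finite.
Let $k = \operatorname{card}(B - A)$. Then $\operatorname{card}(A \cup B) = m + k = m +_{\omega} k \in \omega$. ⊣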


The above corollary can also be proved without using arithmetic 
(Exercises 8 and 9). 


Exercises 


10. Prove part 4 of Theorem 6I.

11. Prove part 5 of Theorem 6I.

12. The proof to Theorem 6I involves eight instances of showing two sets
to be equinumerous. (The eight are listed in the proof of the theorem as
statements numbered 1-6.) In which of these eight cases does equality
actually hold?

13. Show that a finite union of finite sets is finite. That is, show that if $B$
is a finite set whose members are themselves finite sets, then $\bigcup B$ is finite.

14. Define a permutation of $K$ to be any one-to-one function from $K$
onto $K$. We can then define the factorial operation on cardinal numbers
by the equation

\[ \kappa! = \operatorname{card}\{f \mid f \text{ is a permutation of } K\}, \]

where $K$ is any set of cardinality $\kappa$. Show that $\kappa!$ is well defined, i.e., the
value of $\kappa!$ is independent of just which set $K$ is chosen.


ORDERING CARDINAL NUMBERS


We can use the concept of equinumerosity to tell us when two sets A and 
B are of the same size. But when should we say that B is larger than A? 


Definition  A set $A$ is dominated by a set $B$ (written $A \preceq B$) iff there is
a one-to-one function from $A$ into $B$.


For example, any set dominates itself. If $A \subseteq B$, then $A$ is dominated
by $B$, since the identity function on $A$ maps $A$ one-to-one into $B$. More
generally we have: $A \preceq B$ iff $A$ is equinumerous to some subset of $B$. This is
just a restatement of the definition, since $f$ is a function from $A$ into $B$ iff it is
a function from $A$ onto a subset of $B$ (see Fig. 38).


Fig. 38. $F$ shows that $A \preceq B$.


We define the ordering of cardinal numbers by utilizing the concept of
dominance:

\[ \operatorname{card} A \le \operatorname{card} B \quad \text{iff} \quad A \preceq B. \]

As with the operations of cardinal arithmetic, it is necessary to check that
the ordering relation is well defined. For suppose we start with two
cardinal numbers, say $\kappa$ and $\lambda$. In order to determine whether or not
$\kappa \le \lambda$, our definition demands that we employ selected representatives $K$
and $L$ for which $\kappa$ = card $K$ and $\lambda$ = card $L$. Then

\[ \kappa \le \lambda \quad \text{iff} \quad K \preceq L. \]

But the truth or falsity of “$\kappa \le \lambda$” must be independent of which selected
representatives are chosen. Suppose also that $\kappa$ = card $K'$ and $\lambda$ = card $L'$.
To avoid embarrassment, we must be certain that

\[ K \preceq L \quad \text{iff} \quad K' \preceq L'. \]

To prove this, note first that $K \approx K'$ and $L \approx L'$ (because card $K$ = card $K'$
and card $L$ = card $L'$). If $K \preceq L$, then we have one-to-one maps (i) from $K'$
onto $K$, (ii) from $K$ into $L$, and (iii) from $L$ onto $L'$. By composing the
three functions, we can map $K'$ one-to-one into $L'$, and hence $K' \preceq L'$.
The converse implication follows by the same argument with the roles of the
two selections interchanged.

We further define

\[ \kappa < \lambda \quad \text{iff} \quad \kappa \le \lambda \text{ and } \kappa \neq \lambda. \]

Thus in terms of sets we have

\[ \operatorname{card} K < \operatorname{card} L \quad \text{iff} \quad K \preceq L \text{ and } K \not\approx L. \]

Notice that this condition is stronger than just saying that $K$ is equinumerous
to a proper subset of $L$. After all, $\omega$ is equinumerous to a proper subset of
itself, but we certainly do not want to have card $\omega$ < card $\omega$. The definition
of “$<$” has the expected consequence that

\[ \kappa \le \lambda \quad \text{iff} \quad \text{either } \kappa < \lambda \text{ or } \kappa = \lambda. \]


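Examples  1. If $A \subseteq B$, then card $A \le$ card $B$. Conversely, whenever
$\kappa \le \lambda$, then there exist sets $K \subseteq L$ with card $K = \kappa$ and card $L = \lambda$. To prove
this, start with any sets $C$ and $L$ of cardinality $\kappa$ and $\lambda$, respectively. Then
$C \preceq L$, so there is a one-to-one function $f$ from $C$ into $L$. Let $K$ = ran $f$;
then $C \approx K \subseteq L$.

2. For any cardinal $\kappa$, we have $0 \le \kappa$.

3. For any finite cardinal $n$, we have $n < \aleph_0$. (Why?) For any two
finite cardinals $m$ and $n$, we have

\[ m \mathrel{\underline{\in}} n \;\Rightarrow\; m \subseteq n \;\Rightarrow\; m \le n, \]

where $m \mathrel{\underline{\in}} n$ means that $m \in n$ or $m = n$.
Furthermore the converse implications hold. For if $m \le n$, then $m \preceq n$ and
there is a one-to-one function $f\colon m \to n$. By the pigeonhole principle, it is
impossible to have $n$ less than $m$, so by trichotomy $m \mathrel{\underline{\in}} n$. Thus our new
ordering on finite cardinals agrees with the epsilon ordering of Chapter 4.

4. $\kappa < 2^{\kappa}$ for any cardinal $\kappa$. For if $A$ is any set of cardinality $\kappa$, then
$\mathcal{P}A$ has cardinality $2^{\kappa}$. Then $A \preceq \mathcal{P}A$ (map $x \in A$ to $\{x\} \in \mathcal{P}A$), but by
Cantor’s theorem (Theorem 6B) $A \not\approx \mathcal{P}A$. Hence $\kappa \le 2^{\kappa}$ but $\kappa \neq 2^{\kappa}$, i.e.,
$\kappa < 2^{\kappa}$. In particular, there is no largest cardinal number.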


The first thing to prove about the ordering we have defined for 
cardinals is that it actually behaves like something we would be willing to 
call an ordering. After all, just using the symbol “<” does not confer any 
special properties, but it does indicate the expectation that special properties 
will be forthcoming. For a start, we ask if the following are valid for all
cardinals $\kappa$, $\lambda$, and $\mu$:

1. $\kappa \le \kappa$.
2. $\kappa \le \lambda \le \mu \;\Rightarrow\; \kappa \le \mu$.
3. $\kappa \le \lambda$ & $\lambda \le \kappa \;\Rightarrow\; \kappa = \lambda$.
4. Either $\kappa \le \lambda$ or $\lambda \le \kappa$.

The first is obvious, since $A \preceq A$ holds for any set $A$. The second item follows
at once from the fact that

\[ A \preceq B \;\&\; B \preceq C \;\Rightarrow\; A \preceq C. \]

(We prove this by taking the composition of two functions.) The third item
is nontrivial. But the assertion is correct, and is called the Schröder-
Bernstein theorem. We will also prove the fourth item, but that proof will
require the axiom of choice.

First we will prove the Schröder-Bernstein theorem, which will be a basic
tool in calculating the cardinalities of sets. Typically when we want to
calculate card $S$ for a given set $S$, we try to squeeze card $S$ between upper
and lower bounds. If possible, we try to get these bounds to coincide,

\[ \kappa \le \operatorname{card} S \le \kappa, \]

whereupon the Schröder-Bernstein theorem asserts that card $S = \kappa$. We
will see examples of this technique after proving the theorem.


Schröder-Bernstein Theorem  (a) If $A \preceq B$ and $B \preceq A$, then $A \approx B$.
(b) For cardinal numbers $\kappa$ and $\lambda$, if $\kappa \le \lambda$ and $\lambda \le \kappa$, then $\kappa = \lambda$.


Proof  It is done with mirrors (see Fig. 39). We are given one-to-one
functions $f\colon A \to B$ and $g\colon B \to A$. Define $C_n$ by recursion, using the formulas

\[ C_0 = A - \text{ran } g \quad \text{and} \quad C_{n^{+}} = g[f[C_n]]. \]

Thus $C_0$ is the troublesome part that keeps $g$ from being a one-to-one
correspondence between $B$ and $A$. We bounce it back and forth, obtaining
$C_1$, $C_2$, .... The function showing that $A \approx B$ is the function $h\colon A \to B$
defined by

\[ h(x) = \begin{cases} f(x) & \text{if } x \in C_n \text{ for some } n, \\ g^{-1}(x) & \text{otherwise.} \end{cases} \]

Note that in the second case ($x \in A$ but $x \notin C_n$ for any $n$) it follows that
$x \notin C_0$ and hence $x \in$ ran $g$. So $g^{-1}(x)$ makes sense in this case.


Fig. 39. The Schröder-Bernstein theorem.


Does it work? We must verify that $h$ is one-to-one and has range $B$.
Define $D_n = f[C_n]$ so that $C_{n^{+}} = g[D_n]$. To show that $h$ is one-to-one,
consider distinct $x$ and $x'$ in $A$. Since both $f$ and $g^{-1}$ are one-to-one, the
only possible problem arises when, say, $x \in C_m$ and $x' \notin \bigcup_{n \in \omega} C_n$. In this
case,

\[ h(x) = f(x) \in D_m, \]

whereas

\[ h(x') = g^{-1}(x') \notin D_m, \]

lest $x' \in C_{m^{+}}$. So $h(x) \neq h(x')$.

Finally we must check that ran $h$ exhausts $B$. Certainly each $D_n \subseteq$ ran $h$,
because $D_n = h[C_n]$. Consider then a point $y$ in $B - \bigcup_{n \in \omega} D_n$. Where is
$g(y)$? Certainly $g(y) \notin C_0$. Also $g(y) \notin C_{n^{+}}$, because $C_{n^{+}} = g[D_n]$, $y \notin D_n$, and
$g$ is one-to-one. So $g(y) \notin C_n$ for any $n$. Therefore $h(g(y)) = g^{-1}(g(y)) = y$.
This shows that $y \in$ ran $h$, thereby proving part (a).


Part (b) is a restatement of part (a) in terms of cardinal numbers. ⊣
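
The construction of $h$ can be carried out concretely when membership in the sets $C_n$ is decidable. The sketch below assumes a particular, made-up instance: $A = B$ = the natural numbers, with $f(x) = 2x$ and $g(y) = 2y + 1$, neither of which is onto. The names `in_some_C`, `N`, and `values` are introduced for the illustration.

```python
def f(x):
    """One-to-one map from A = N into B = N (not onto)."""
    return 2 * x

def g(y):
    """One-to-one map from B = N into A = N (not onto)."""
    return 2 * y + 1

def in_some_C(x):
    """Decide whether x lies in C_n for some n, where
    C_0 = A - ran g and C_{n+} = g[f[C_n]]."""
    while True:
        if x % 2 == 0:          # x is not in ran g, so x is in C_0
            return True
        y = (x - 1) // 2        # x = g(y)
        if y % 2 == 1:          # y is not in ran f, so x is in no C_n
            return False
        x = y // 2              # y = f(x'), and x is in C_{n+} iff x' is in C_n

def h(x):
    """The correspondence of the proof: f on the union of the C_n,
    and g^{-1} everywhere else."""
    return f(x) if in_some_C(x) else (x - 1) // 2

N = 200
values = [h(x) for x in range(2 * N + 2)]
```

On this initial segment the values of `h` never repeat, and every number below `N` appears among them, as the theorem requires of a one-to-one correspondence. The loop in `in_some_C` terminates because the number being examined strictly decreases each time around.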


The Schröder-Bernstein theorem is sometimes called the “Cantor-
Bernstein theorem.” Cantor proved the theorem in his 1897 paper, but his
proof utilized a principle that is equivalent to the axiom of choice. Ernst
Schröder announced the theorem in an 1896 abstract. His proof, published
in 1898, was imperfect, and he published a correction in 1911. The first
fully satisfactory proof was given by Felix Bernstein and was published in
an 1898 book by Borel.


Examples  The usefulness of the Schröder-Bernstein theorem in calculating
cardinalities is indicated by the following examples.

1. If $A \subseteq B \subseteq C$ and $A \approx C$, then all three sets are equinumerous. To
prove this, let $\kappa$ = card $A$ = card $C$ and let $\lambda$ = card $B$. Then by hypothesis
$\kappa \le \lambda \le \kappa$, so by the Schröder-Bernstein theorem $\kappa = \lambda$.

2. The set $\mathbb{R}$ of real numbers is equinumerous to the closed unit interval
$[0, 1]$. For we have

\[ (0, 1) \subseteq [0, 1] \subseteq \mathbb{R}, \]

and (as noted previously) $\mathbb{R} \approx (0, 1)$. Thus by the preceding example, all
three sets are equinumerous. (For a more direct construction of a one-to-one
correspondence between $\mathbb{R}$ and $[0, 1]$, we suggest trying Exercise 4.)

3. If $\kappa \le \lambda \le \mu$, then, as we observed before, $\kappa \le \mu$. We can now give
an improved version:

\[ \kappa \le \lambda < \mu \;\Rightarrow\; \kappa < \mu, \]
\[ \kappa < \lambda \le \mu \;\Rightarrow\; \kappa < \mu. \]

For by the earlier observation we obtain $\kappa \le \mu$; if equality held, then (as in
the first example) all three cardinal numbers would coincide.

4. $\mathbb{R} \approx {}^{\omega}2$, and hence $\mathbb{R} \approx \mathcal{P}\omega$. Thus the set of real numbers is
equinumerous to the power set of $\omega$. To prove this it suffices, by the
Schröder-Bernstein theorem, to show that $\mathbb{R} \preceq {}^{\omega}2$ and ${}^{\omega}2 \preceq \mathbb{R}$.

To show that $\mathbb{R} \preceq {}^{\omega}2$, we construct a one-to-one function from the open
unit interval $(0, 1)$ into ${}^{\omega}2$. The existence of such a function, together
with the fact that $\mathbb{R} \approx (0, 1)$, gives us

\[ \mathbb{R} \approx (0, 1) \preceq {}^{\omega}2. \]

The function is defined by use of binary expansions of real numbers; map the
real whose binary expansion is 0.1100010... to the function in ${}^{\omega}2$ whose
successive values are 1, 1, 0, 0, 0, 1, 0, .... In general, for a real number
$z$ in $(0, 1)$, let $H(z)$ be the function $H(z)\colon \omega \to 2$ whose value at $n$ equals the
$(n + 1)$st bit (binary digit) in the binary expansion of $z$. Clearly $H$ is one-to-
one. (But $H$ does not have all of ${}^{\omega}2$ for its range. Note that 0.1000... =
0.0111... = $\tfrac{1}{2}$. For definiteness, always select the nonterminating binary
expansion.)

To show that ${}^{\omega}2 \preceq \mathbb{R}$ we use decimal expansions. The function in ${}^{\omega}2$
whose successive values are 1, 1, 0, 0, 0, 1, 0, ... is mapped to the real
number with decimal expansion 0.1100010.... This maps ${}^{\omega}2$ one-to-one into
the closed interval $[0, \tfrac{1}{9}]$.

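5. By virtue of the above example,

\[ \operatorname{card} \mathbb{R} = 2^{\aleph_0}. \]

Consequently the plane $\mathbb{R} \times \mathbb{R}$ has cardinality

\[ 2^{\aleph_0} \cdot 2^{\aleph_0} = 2^{\aleph_0 + \aleph_0} = 2^{\aleph_0}. \]

Thus the line $\mathbb{R}$ is equinumerous to the plane
$\mathbb{R} \times \mathbb{R}$. This will not come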
as a surprise if you have heard of “space-filling” curves. 
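
The two maps used in item 4 can be tried out on rational inputs. The following Python sketch is only an illustration: the names `bits` and `decimal_image` are ours, and a program can of course produce only finitely many values of $H(z)$. It uses the nonterminating binary expansion, as the text prescribes.

```python
from fractions import Fraction

def bits(z, count):
    """First `count` bits of the nonterminating binary expansion of z, 0 < z <= 1."""
    out = []
    for _ in range(count):
        z *= 2
        if z > 1:          # strict inequality selects 0.0111... over 0.1000...
            out.append(1)
            z -= 1
        else:
            out.append(0)
    return out

def decimal_image(values):
    """Send a finite 0/1 sequence to the real with that decimal expansion."""
    return sum(Fraction(b, 10 ** (i + 1)) for i, b in enumerate(values))

half = bits(Fraction(1, 2), 5)                  # nonterminating expansion of 1/2
image = decimal_image([1, 1, 0, 0, 0, 1, 0])    # the decimal 0.1100010
```

Every value of `decimal_image` is at most 0.111... = 1/9, in agreement with the interval named in the text.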


The next theorem shows that the operations of cardinal arithmetic have 
the expected order-preserving properties. 


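Theorem 6L  Let $\kappa$, $\lambda$, and $\mu$ be cardinal numbers.


(a) $\kappa \le \lambda \;\Rightarrow\; \kappa + \mu \le \lambda + \mu$.

(b) $\kappa \le \lambda \;\Rightarrow\; \kappa \cdot \mu \le \lambda \cdot \mu$.

(c) $\kappa \le \lambda \;\Rightarrow\; \kappa^{\mu} \le \lambda^{\mu}$.

(d) $\kappa \le \lambda \;\Rightarrow\; \mu^{\kappa} \le \mu^{\lambda}$; if not both
$\kappa$ and $\mu$ equal zero.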


Proof  Let $K$, $L$, and $M$ be sets of cardinality $\kappa$, $\lambda$, and
$\mu$, respectively. Then ${}^{M}K$ has cardinality $\kappa^{\mu}$, etc. We assume
that $\kappa \le \lambda$; hence we may select $K$ and $L$ such that
$K \subseteq L$. And we may select $M$ so that $L \cap M = \emptyset$.
Parts (a), (b), and (c) are now immediate, since

\[ K \cup M \subseteq L \cup M, \qquad K \times M \subseteq L \times M, \qquad {}^{M}K \subseteq {}^{M}L. \]

For part (d), first consider the case in which $\mu = 0$. Then $\kappa \ne 0$ and
$\mu^{\kappa} = 0 \le \mu^{\lambda}$. There remains the case in which $\mu \ne 0$,
i.e., $M \ne \emptyset$. Take some fixed $a \in M$. We need a one-to-one function
$G$ from ${}^{K}M$ into ${}^{L}M$. For any $f \in {}^{K}M$, define $G(f)$ to be the
function with domain $L$ for which

\[ G(f)(x) = \begin{cases} f(x) & \text{if } x \in K, \\ a & \text{if } x \in L - K. \end{cases} \]

In one line, $G(f) = f \cup ((L - K) \times \{a\})$. Then
$G\colon {}^{K}M \to {}^{L}M$ and $G$ is clearly one-to-one. ⊣
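
For small finite sets the map $G$ of part (d) can be checked directly. The sketch below is an illustration only; functions are represented as Python dictionaries, and the name `extend` and the sample sets are our own choices.

```python
from itertools import product

def extend(f, L, a):
    """Return G(f): agrees with f on its domain K and takes the value a on L - K."""
    g = dict(f)
    for x in L:
        if x not in f:
            g[x] = a
    return g

K = {0, 1}
L = {0, 1, 2, 3}          # K is a subset of L
M = {"p", "q"}
a = "p"                   # a fixed member of M

# All functions from K into M, and their images under G.
all_f = [dict(zip(sorted(K), vals)) for vals in product(sorted(M), repeat=len(K))]
images = [extend(f, L, a) for f in all_f]
```

Distinct functions on $K$ stay distinct after extension, because $G(f)$ restricted to $K$ is $f$ again.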


Example  We can calculate the product $\aleph_0 \cdot 2^{\aleph_0}$ by the method
of upper and lower bounds:

\[ 2^{\aleph_0} \le \aleph_0 \cdot 2^{\aleph_0} \le 2^{\aleph_0} \cdot 2^{\aleph_0} = 2^{\aleph_0}, \]

whence equality holds throughout.


We would like to show that $\aleph_0$ is the least infinite cardinal; that is,
that $\aleph_0 \le \kappa$ for any infinite cardinal $\kappa$. This amounts (by the
definition of $\le$) to showing that $\omega \preccurlyeq A$ for any infinite $A$.
We might attempt to define a one-to-one function $g\colon \omega \to A$ by
recursion:

\[ g(0) = \text{some member of } A, \]
\[ g(n^{+}) = \text{some member of } A - g[n^{+}]. \]

Here $A - g[n^{+}]$ is nonempty, lest $A$ be finite. A minor difficulty here is
that $g(n^{+})$ is being defined not from $g(n)$ but from
$g[n^{+}] = \{g(0), g(1), \ldots, g(n)\}$. This difficulty is easily
circumvented, and will be circumvented in the proof of Theorem 6N. A major
difficulty is the phrase “some member.” Unless we say which member, the above
cannot possibly be construed as defining $g$.


What is needed here is the axiom of choice, which will enable us to
convert “some member” into a more acceptable phrase.


Exercises


15. Show that there is no set $\mathcal{A}$ with the property that for every set
there is some member of $\mathcal{A}$ that dominates it.

16. Show that for any set $S$ we have $S \preccurlyeq {}^{S}2$, but
$S \not\approx {}^{S}2$. (This should be done directly, without use of
$\mathcal{P}S$ or cardinal numbers. If $F\colon S \to {}^{S}2$, then define
$g(x) = 1 - F(x)(x)$.)

17. Give counterexamples to show that we cannot strengthen Theorem
6L by replacing “$\le$” by “$<$” throughout.
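
The function $g$ suggested in Exercise 16 is easy to experiment with on a small set. This Python sketch is an illustration, not a solution: the three-element set and the particular $F$ are hypothetical, and functions are stored as dictionaries.

```python
def diagonal(F):
    """Return g with g(x) = 1 - F(x)(x), where F sends each x in S to a 0/1-valued function on S."""
    return {x: 1 - F[x][x] for x in F}

# A hypothetical attempt at mapping S = {"a", "b", "c"} onto the functions from S into {0, 1}.
F = {
    "a": {"a": 0, "b": 1, "c": 1},
    "b": {"a": 1, "b": 1, "c": 0},
    "c": {"a": 0, "b": 0, "c": 0},
}
g = diagonal(F)
```

Whatever $F$ is chosen, $g$ disagrees with $F(x)$ at the argument $x$, so $g$ is missed by $F$.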


AXIOM OF CHOICE 


At several points in this book we have already encountered the need for 
a principle asserting the possibility of selecting members from nonempty 
sets. We can no longer postpone a systematic discussion of such a principle. 
There are numerous equivalent formulations of the axiom of choice. The 
following theorem lists six of them. Others will be found in the exercises. 


Theorem 6M  The following statements are equivalent.


(1) Axiom of choice, I. For any relation $R$, there is a function
$F \subseteq R$ with $\operatorname{dom} F = \operatorname{dom} R$.

(2) Axiom of choice, II; multiplicative axiom. The Cartesian product
of nonempty sets is always nonempty. That is, if $H$ is a function with domain
$I$ and if $(\forall i \in I)\, H(i) \ne \emptyset$, then there is a function $f$
with domain $I$ such that $(\forall i \in I)\, f(i) \in H(i)$.

(3) Axiom of choice, III. For any set $A$ there is a function $F$ (a
“choice function” for $A$) such that the domain of $F$ is the set of nonempty
subsets of $A$, and such that $F(B) \in B$ for every nonempty $B \subseteq A$.

(4) Axiom of choice, IV. Let $\mathcal{A}$ be a set such that (a) each member
of $\mathcal{A}$ is a nonempty set, and (b) any two distinct members of
$\mathcal{A}$ are disjoint. Then there exists a set $C$ containing exactly one
element from each member of $\mathcal{A}$ (i.e., for each $B \in \mathcal{A}$ the
set $C \cap B$ is a singleton $\{x\}$ for some $x$).

(5) Cardinal comparability. For any sets $C$ and $D$, either
$C \preccurlyeq D$ or $D \preccurlyeq C$. For any two cardinal numbers $\kappa$
and $\lambda$, either $\kappa \le \lambda$ or $\lambda \le \kappa$.

(6) Zorn’s lemma. Let $\mathcal{A}$ be a set such that for every chain
$\mathcal{B} \subseteq \mathcal{A}$, we have $\bigcup \mathcal{B} \in \mathcal{A}$.
($\mathcal{B}$ is called a chain iff for any $C$ and $D$ in $\mathcal{B}$, either
$C \subseteq D$ or $D \subseteq C$.) Then $\mathcal{A}$ contains an element $M$
(a “maximal” element) such that $M$ is not a subset of any other set in
$\mathcal{A}$.


Statements (1)-(4) are synoptic ways of saying that there exist uniform 
methods for selecting elements from sets. On the other hand, statements 
(5) and (6) appear to be rather different. 


Proof in part  First we will prove that (1)-(4) are equivalent.

(1) ⇒ (2)  To prove (2), we assume that $H$ is a function with domain $I$
and that $H(i) \ne \emptyset$ for each $i \in I$. In order to utilize (1), define
the relation

\[ R = \{\langle i, x \rangle \mid i \in I \;\&\; x \in H(i)\}. \]

Then (1) provides us with a function $F \subseteq R$ such that
$\operatorname{dom} F = \operatorname{dom} R = I$. Then since
$\langle i, F(i) \rangle \in F \subseteq R$, we must have $F(i) \in H(i)$. Thus the
conclusion of (2) holds.

(2) ⇒ (4)  Let $\mathcal{A}$ be a set meeting conditions (a) and (b) of (4).
Let $H$ be the identity function on $\mathcal{A}$; then
$(\forall B \in \mathcal{A})\, H(B) \ne \emptyset$. Hence by (2) there is a
function $f$ with domain $\mathcal{A}$ such that
$(\forall B \in \mathcal{A})\, f(B) \in H(B) = B$. Let
$C = \operatorname{ran} f$. Then for $B \in \mathcal{A}$ we have
$B \cap C = \{f(B)\}$. (Nothing else could belong to $B \cap C$ by condition
(b).)

(4) ⇒ (3)  Given a set $A$, define

\[ \mathcal{A} = \{\{B\} \times B \mid B \text{ is a nonempty subset of } A\}. \]

Then each member of $\mathcal{A}$ is nonempty, and any two distinct members are
disjoint (if $\langle x, y \rangle \in (\{B\} \times B) \cap (\{B'\} \times B')$,
then $x = B = B'$). Let $C$ be a set (provided by (4)) whose intersection with
each member of $\mathcal{A}$ is a singleton:

\[ C \cap (\{B\} \times B) = \{\langle B, x \rangle\}, \]

where $x \in B$. It is a priori possible that $C$ contains extraneous elements
not belonging to any member of $\mathcal{A}$. So discard them by letting
$F = C \cap (\bigcup \mathcal{A})$. We claim that $F$ is a choice function for
$A$. Any member of $F$ belongs to some $\{B\} \times B$, and hence is of the form
$\langle B, x \rangle$ for $x \in B$. For any one nonempty set $B \subseteq A$,
there is a unique $x$ such that $\langle B, x \rangle \in F$, because
$F \cap (\{B\} \times B)$ is a singleton. This $x$ is of course $F(B)$ and it is
a member of $B$.

(3) ⇒ (1)  Consider any relation $R$. Then (3) provides us with a choice
function $G$ for $\operatorname{ran} R$; thus $G(B) \in B$ for any nonempty subset
$B$ of $\operatorname{ran} R$. Then define a function $F$ with
$\operatorname{dom} F = \operatorname{dom} R$ by

\[ F(x) = G(\{y \mid xRy\}). \]

Then $F(x) \in \{y \mid xRy\}$, i.e., $\langle x, F(x) \rangle \in R$. Hence
$F \subseteq R$.
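
The step (3) ⇒ (1) is a short computation once a choice function is in hand. The Python sketch below illustrates it for a finite relation of integers, where "take the least member" serves as a stand-in choice function (so no appeal to the axiom of choice is involved); the names `choice` and `function_inside` are ours.

```python
def choice(B):
    """Stand-in for G on nonempty finite sets of integers: select the least member."""
    return min(B)

def function_inside(R):
    """Given a relation R (a set of ordered pairs), return F ⊆ R with dom F = dom R."""
    domain = {x for (x, _) in R}
    # F(x) = G({y | xRy})
    return {(x, choice({y for (u, y) in R if u == x})) for x in domain}

R = {(1, 5), (1, 3), (2, 7), (3, 4), (3, 9)}
F = function_inside(R)
```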
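
It remains to include parts (5) and (6) of the theorem.

(6) ⇒ (1)  The strategy behind this application (and others) of Zorn’s
lemma is to form a collection $\mathcal{A}$ of pieces of the desired object, and
then to show that a maximal piece serves the intended purpose. In the present
case, we are given a relation $R$ and we choose to define

\[ \mathcal{A} = \{f \subseteq R \mid f \text{ is a function}\}. \]

Before we can appeal to Zorn’s lemma, we must check that $\mathcal{A}$ is closed
under unions of chains. So consider any chain $\mathcal{B} \subseteq \mathcal{A}$.
Since every member of $\mathcal{B}$ is a subset of $R$, $\bigcup \mathcal{B}$ is a
subset of $R$. To see that $\bigcup \mathcal{B}$ is a function, we use the fact
that $\mathcal{B}$ is a chain. If $\langle x, y \rangle$ and
$\langle x, z \rangle$ belong to $\bigcup \mathcal{B}$, then

\[ \langle x, y \rangle \in G \in \mathcal{B} \quad\text{and}\quad \langle x, z \rangle \in H \in \mathcal{B} \]

for some functions $G$ and $H$ in $\mathcal{A}$. Either $G \subseteq H$ or
$H \subseteq G$; in either event both $\langle x, y \rangle$ and
$\langle x, z \rangle$ belong to a single function, so $y = z$. Hence
$\bigcup \mathcal{B}$ is in $\mathcal{A}$. Now we can appeal to (6), which
provides us with a maximal function $F$ in $\mathcal{A}$. We claim that
$\operatorname{dom} F = \operatorname{dom} R$. For otherwise take any
$x \in \operatorname{dom} R - \operatorname{dom} F$. Since
$x \in \operatorname{dom} R$, there is some $y$ with $xRy$. Define

\[ F' = F \cup \{\langle x, y \rangle\}. \]

Then $F' \in \mathcal{A}$, contradicting the maximality of $F$. Hence
$\operatorname{dom} F = \operatorname{dom} R$.

(6) ⇒ (5)  Let $C$ and $D$ be any sets; we will show that either
$C \preccurlyeq D$ or $D \preccurlyeq C$. In order to utilize (6), define

\[ \mathcal{A} = \{f \mid f \text{ is a one-to-one function} \;\&\; \operatorname{dom} f \subseteq C \;\&\; \operatorname{ran} f \subseteq D\}. \]

Consider any chain $\mathcal{B} \subseteq \mathcal{A}$. As in the preceding
paragraph, $\bigcup \mathcal{B}$ is a function, and a similar argument shows that
$\bigcup \mathcal{B}$ is one-to-one. Next consider
$\langle x, y \rangle \in \bigcup \mathcal{B}$; then
$\langle x, y \rangle \in f \in \mathcal{A}$. Consequently $x \in C$ and
$y \in D$. Thus $\operatorname{dom} \bigcup \mathcal{B} \subseteq C$ and
$\operatorname{ran} \bigcup \mathcal{B} \subseteq D$. Hence
$\bigcup \mathcal{B} \in \mathcal{A}$ and we can apply (6) to obtain a maximal
$f \in \mathcal{A}$. We claim that either $\operatorname{dom} f = C$ (in which
case $C \preccurlyeq D$) or $\operatorname{ran} f = D$ (in which case
$D \preccurlyeq C$ since $f^{-1}$ is then a one-to-one function from $D$ into
$C$). Suppose to the contrary that neither condition holds, so that there exist
elements $c \in C - \operatorname{dom} f$ and $d \in D - \operatorname{ran} f$.
Then

\[ f' = f \cup \{\langle c, d \rangle\} \]

is in $\mathcal{A}$, contradicting the maximality of $f$. (You will observe that
the strategy underlying this application of Zorn’s lemma is the same one as in
the preceding paragraph.)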
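
For finite sets the maximal one-to-one function of (6) ⇒ (5) can be built by simply extending until no extension is possible, and the dichotomy "dom f = C or ran f = D" can then be observed directly. The sketch below is a finite illustration only (Zorn's lemma is not needed in this case); the name `maximal_injection` is ours.

```python
def maximal_injection(C, D):
    """Greedily build a one-to-one f with dom f ⊆ C and ran f ⊆ D that cannot be extended."""
    f = {}
    spare = sorted(D)            # members of D not yet in ran f
    for c in sorted(C):
        if not spare:            # ran f = D, so no pair <c, d> can be added
            break
        f[c] = spare.pop(0)      # extend f by the pair <c, d>
    return f

f1 = maximal_injection({1, 2, 3}, {"x", "y"})       # here ran f = D
f2 = maximal_injection({1, 2}, {"x", "y", "z"})     # here dom f = C
```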

At this point we have proved that part of Theorem 6M shown in Fig. 40. 
The proof will be completed in Chapter 7. 


Remarks  Zorn’s lemma first appeared in a 1922 paper by Kuratowski.
Earlier maximality principles that were similar in spirit had been published
by Felix Hausdorff. The importance of Zorn’s lemma lies in the fact that it is
well suited to many applications in mathematics. For example, to prove in
linear algebra that every vector space has a basis requires some form of
choice, and Zorn’s lemma is a convenient form to use here. (Take $\mathcal{A}$
to be the collection of all linearly independent sets; then a maximal element
will be a basis.) Similarly in proving that there exist maximal proper ideals
in a ring with identity or in proving that there exist maximal Abelian
subgroups in a group, Zorn’s lemma provides a suitable tool.

We can give a plausibility argument for Zorn’s lemma as follows. $\mathcal{A}$
cannot be empty, because $\emptyset$ is a chain and so
$\emptyset = \bigcup \emptyset \in \mathcal{A}$. Probably $\emptyset$ is not
maximal, so we can choose a larger set. If that larger set is not
maximal, we can choose a still larger one. After infinitely many steps, even 
if we have not found a maximal set we have at least formed a chain. So 
we can take its union and continue. The procedure can stop only when we 
finally reach a maximal set. So if only we are patient enough, we should 
reach such a set. 

Fig. 40. This much of Theorem 6M we prove now; the rest is postponed.

We now formally add the axiom of choice to our list of axioms. And
we do it without a marginal stripe. This is because historically the axiom 
of choice has had a unique status. Initially some mathematicians objected 
to the axiom, because it asserts the existence of a set without specifying 
exactly what is in it. To this extent it is less “constructive” than the other 
axioms. Gradually the axiom has won acceptance (at least acceptance by 
most mathematicians willing to accept classical logic). But it retains a 
slightly tarnished image, from the days when it was not quite respectable. 
Consequently it has become widespread practice, each time the axiom of
choice is used, to make explicit mention of the fact. (No such gesture is
extended to the other axioms, which are used extensively and mentioned 
rarely.) 


In Chapter 7 we will complete our axiomatization by adding the 
replacement axioms and the regularity axiom. 


Without the axiom of choice, we can still prove that for a finite set $I$, if
$H(i) \ne \emptyset$ for all $i \in I$, then there is a function $f$ with domain
$I$ such that $f(i) \in H(i)$ for all $i \in I$ (Exercise 19). But for an
infinite set $I$, the axiom of choice is indispensable. (If we are making seven
choices, we can explicitly mention each one; for $\aleph_0$ choices this is no
longer possible.)

In particular, if we are choosing one thing, then we do not need the
axiom of choice. (We mention this because people sometimes overreact
when first hearing of the axiom of choice.) For example, assume that
$A \ne \emptyset$. Then there exists some $y_0$ in $A$; we can use $y_0$ in
further arguments (to show, e.g., that some singleton $\{y_0\}$ is a subset of
$A$). This makes no use of the axiom of choice, even when phrased as “we can
choose some fixed $y_0$ in $A$.”

For a second example, assume we are given a relation $R$. For any
particular $x$ in $\operatorname{dom} R$, there exists some $y_0$ for which
$xRy_0$. And we can conclude that for any $x$ in $\operatorname{dom} R$, there is
a singleton $\{y_0\}$ included in $\{t \mid xRt\}$. We have not yet used the
axiom of choice. What does require the axiom of choice is saying that for any
$x$ in $\operatorname{dom} R$ there is some $y_x$ for which $xRy_x$, and then
putting all these $y_x$’s together into a set, e.g.,

\[ \{y_x \mid x \in \operatorname{dom} R\}. \]


This in effect is making many choices, one for each x in dom R, which 
may be an infinite set. (A human being cannot perform infinitely many 
actions. But the axiom of choice asserts that, despite our practical 
limitations, there exists, in theory, a set produced by making infinitely many 
choices.) 

Another situation in which the axiom of choice can be avoided is where 
we can specify exactly which object we want to choose. For example, we 
can show, without using the axiom of choice, that there is a choice
function for $\omega$. Namely, define
$F\colon (\mathcal{P}\omega - \{\emptyset\}) \to \omega$ by

\[ F(A) = \text{the least member of } A \]

for $\emptyset \ne A \subseteq \omega$. The crucial point here is that we can
write down an expression defining the selected member of $A$. If $\omega$ is
replaced by $\mathbb{R}$, for example, there is no longer the possibility of
defining a way to select numbers from arbitrary nonempty subsets of
$\mathbb{R}$.

Since there is a choice function on $\omega$, it follows that there is a choice
function on any set equinumerous to $\omega$. (Why?) And similarly any finite set
has a choice function. 
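
As a concrete instance, a definable choice function on the integers can be obtained by numbering the integers and selecting the member with the least number. The Python sketch below is an illustration under our own assumptions: the numbering `index` and the name `choose` are not from the text, and the program can only be run on finite sets, although the definition itself makes sense for every nonempty set of integers.

```python
def index(z):
    """A one-to-one correspondence from the integers onto the natural numbers:
    0, -1, 1, -2, 2, ... receive the numbers 0, 1, 2, 3, 4, ..."""
    return 2 * z if z >= 0 else -2 * z - 1

def choose(B):
    """Select the member of the nonempty set B whose number is least."""
    return min(B, key=index)

picked = choose({-5, 3, 7})
```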

The following example is due to Russell. If we have $\aleph_0$ pairs of shoes,
then we can select one shoe from each pair without using the axiom of
choice. We simply select the left shoe from each pair. But if we have
$\aleph_0$ pairs of socks, then we must use the axiom of choice if we are to
select
one sock from each pair. For there is no definable difference between the 
two socks in a pair. 


Example  Assume that $f$ is a function from $A$ onto $B$. We claim that
$B \preccurlyeq A$. To verify this, recall that by Theorem 3J(b), the proof of
which used choice, there is a right inverse $g\colon B \to A$ such that
$f \circ g = I_B$. And $g$ is one-to-one since it has a left inverse. The
function $g$ shows that $B \preccurlyeq A$.

Conversely assume that $B \preccurlyeq A$ and $B$ is nonempty. Then (since
$B \preccurlyeq A$) there is a one-to-one function $g\colon B \to A$. By Theorem
3J(a) there is a left inverse $f\colon A \to B$ such that $f \circ g = I_B$. And
$f$ maps $A$ onto $B$ since it has a right inverse.

Hence we can conclude: If $B$ is nonempty, then $B \preccurlyeq A$ iff there is
a function from $A$ onto $B$. As a special case, we can conclude that a
nonempty set $B$ is dominated by $\omega$ iff there is a function from $\omega$
onto $B$. (This special case can be proved without the axiom of choice.)


We now utilize the axiom of choice to prove that $\aleph_0$ is the least
infinite cardinal number.


Theorem 6N  (a) For any infinite set $A$, we have $\omega \preccurlyeq A$.
(b) $\aleph_0 \le \kappa$ for any infinite cardinal $\kappa$.


Proof  Part (b) is merely a restatement of part (a) in terms of cardinals;
it suffices to prove (a). So consider an infinite set $A$. The idea is to
select $\aleph_0$ things from $A$: a first, then a second, then a third, ....
Let $F$ be a choice function for $A$ (as provided by the axiom of choice, III).
We must somehow employ $F$ repeatedly; in fact $\aleph_0$ times.

As a first try, we could attempt to define a function $g\colon \omega \to A$ by
recursion. We know $A$ is nonempty; fix some element $a \in A$. Maybe we can
take

\[ g(0) = a, \]
\[ g(n^{+}) = \text{the chosen member of } A - \{g(0), \ldots, g(n)\} = F(A - g[n^{+}]). \]

No, we cannot. This would require a stronger recursion theorem than the
one proved in Chapter 4. We are attempting to define $g(n^{+})$ not just from
$g(n)$, but from the entire list $g(0), g(1), \ldots, g(n)$.

But our second try will succeed. We will define by recursion a function
$h$ from $\omega$ into the set of finite subsets of $A$. The guiding idea is to
have, for example, $h(7) = \{g(0), \ldots, g(6)\}$. Formally,

\[ h(0) = \emptyset, \]
\[ h(n^{+}) = h(n) \cup \{F(A - h(n))\}. \]

Thus we start with $\emptyset$, and successively add chosen new elements from
$A$. $A - h(n)$ is nonempty because $A$ is infinite and $h(n)$ is a finite
subset. And $h(n^{+})$ is again a finite subset of $A$.

Now we can go back and recover $g$. Define

\[ g(n) = F(A - h(n)) \]

so that $g(n)$ is the thing we added to the set $h(n)$ to make $h(n^{+})$. Thus
$g\colon \omega \to A$; we must check that $g$ is one-to-one.

Suppose then that $m \ne n$. One number is less than the other; say
$m \in n$. Then $m^{+} \underline{\in}\, n$ (that is, $m^{+} \in n$ or
$m^{+} = n$) and so

\[ g(m) \in h(m^{+}) \subseteq h(n). \]

But $g(n) \notin h(n)$, since

\[ g(n) = F(A - h(n)) \in A - h(n). \]

Hence $g(m) \ne g(n)$. ⊣
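
The recursion for $h$ and the recovery of $g$ can be traced in a small program. The sketch below is an illustration under stated assumptions: $A$ is taken to be the set of multiples of 3, and the choice function is replaced by the definable rule "least member not yet used", so the axiom of choice plays no role here. Only the shape of the recursion is the point; the names `in_A`, `F`, `h`, and `g` mirror the proof but the code is ours.

```python
from itertools import count

def in_A(x):
    """Membership test for the infinite set A (assumed here: the multiples of 3)."""
    return x % 3 == 0

def F(excluded):
    """Stand-in for F(A - excluded): the least member of A outside the finite set `excluded`."""
    return next(x for x in count() if in_A(x) and x not in excluded)

def h(n):
    """h(0) is empty; h(n+1) = h(n) ∪ {F(A - h(n))}."""
    s = frozenset()
    for _ in range(n):
        s = s | {F(s)}
    return s

def g(n):
    """g(n) = F(A - h(n)), the element added to h(n) to form h(n+1)."""
    return F(h(n))

first_values = [g(n) for n in range(5)]
```

Note that each $g(n)$ is computed from the finite set $h(n)$ alone, which is what the ordinary recursion theorem permits.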
We can now list some consequences of the foregoing theorem. 


1. Any infinite subset of $\omega$ is equinumerous to $\omega$. In this special
case we can avoid using the axiom of choice, since we have a choice function
for $\omega$ anyway.

2. We have by part (b) of the theorem that if $\kappa < \aleph_0$, then
$\kappa$ is finite. Since the converse is clear, we have

\[ \kappa < \aleph_0 \quad\text{iff}\quad \kappa \text{ is finite}. \]

3. We get another proof that subsets of finite sets are finite (Corollary
6G). For if

\[ \operatorname{card} A \le \operatorname{card} n < \aleph_0, \]

then by the Schröder-Bernstein theorem $\operatorname{card} A < \aleph_0$, and
hence $A$ is finite.


Another consequence of the preceding theorem is the following char- 
acterization of the infinite sets (proposed by Dedekind in his 1888 book as 
a definition of “infinite”). 


Corollary 6P A set is infinite iff it is equinumerous to a proper subset 
of itself. 


Proof Half of this result is contained in Corollary 6D, where we showed 
that if a set was equinumerous to a proper subset of itself, then it was infinite. 
Conversely, consider an infinite set $A$. Then by the above theorem, there is
a one-to-one function $f$ from $\omega$ into $A$. Define a function $g$ from $A$
into $A$ by

\[ g(f(n)) = f(n^{+}) \quad\text{for } n \in \omega, \]
\[ g(x) = x \quad\text{for } x \notin \operatorname{ran} f \]

(see Fig. 41). Then $g$ is a one-to-one function from $A$ onto
$A - \{f(0)\}$. ⊣

Fig. 41. $f(0)$ is not in ran $g$.
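
The shift map $g$ from the proof of Corollary 6P can be written out for a specific case. In the sketch below, $A$ is assumed to be the set of integers and $f(n) = 10n$ is an assumed one-to-one function from $\omega$ into $A$; both are our choices for illustration.

```python
def f(n):
    """An assumed one-to-one function from the natural numbers into A (the integers)."""
    return 10 * n

def g(x):
    """g(f(n)) = f(n + 1) for n in ω; g(x) = x for x outside ran f."""
    if x >= 0 and x % 10 == 0:       # x = f(n) with n = x // 10
        return f(x // 10 + 1)
    return x

sample = range(-30, 31)
values = [g(x) for x in sample]
```

On this sample the values are pairwise distinct and $f(0) = 0$ is never attained, as the proof predicts.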


Exercises 


18. Prove that the following statement is equivalent to the axiom of choice:
For any set $\mathcal{A}$ whose members are nonempty sets, there is a function
$f$ with domain $\mathcal{A}$ such that $f(X) \in X$ for all $X$ in
$\mathcal{A}$.


19. Assume that $H$ is a function with finite domain $I$ and that $H(i)$ is
nonempty for each $i \in I$. Without using the axiom of choice, show that there
is a function $f$ with domain $I$ such that $f(i) \in H(i)$ for each $i \in I$.
[Suggestion: Use induction on card $I$.]


20. Assume that $A$ is a nonempty set and $R$ is a relation such that
$(\forall x \in A)(\exists y \in A)\, yRx$. Show that there is a function
$f\colon \omega \to A$ with $f(n^{+}) R f(n)$ for all $n$ in $\omega$.


21. (Teichmüller-Tukey lemma) Assume that $\mathcal{A}$ is a nonempty set such
that for every set $B$,

\[ B \in \mathcal{A} \iff \text{every finite subset of } B \text{ is a member of } \mathcal{A}. \]

Show that $\mathcal{A}$ has a maximal element, i.e., an element that is not a
subset of any other element of $\mathcal{A}$.


22. Show that the following statement is another equivalent version of the
axiom of choice: For any set $A$ there is a function $F$ with
$\operatorname{dom} F = \bigcup A$ and such that $x \in F(x) \in A$ for all
$x \in \bigcup A$.


23. Show that in the proof to Theorem 6N, we have $g[n] = h(n)$.


24. How would you define the sum of infinitely many cardinal numbers?
Infinite products?


25. Assume that $S$ is a function with domain $\omega$ such that
$S(n) \subseteq S(n^{+})$ for each $n \in \omega$. (Thus $S$ is an increasing
sequence of sets.) Assume that $B$ is a subset of the union
$\bigcup_{n \in \omega} S(n)$ such that for every infinite subset $B'$ of $B$
there is some $n$ for which $B' \cap S(n)$ is infinite. Show that $B$ is a subset
of some $S(n)$.


COUNTABLE SETS


The definition below applies the word “countable” to those sets whose 
elements can, in a sense, be counted by means of the natural numbers. A 
“counting” of a set can be taken to be a one-to-one correspondence between 
the members of the set and the natural numbers (or the natural numbers 
less than some number $n$). This requires that the set be no larger than
$\omega$.


Definition  A set $A$ is countable iff $A \preccurlyeq \omega$, i.e., iff
$\operatorname{card} A \le \aleph_0$.


Since we have recently found that

\[ \kappa < \aleph_0 \iff \kappa \text{ is finite}, \]

we can also formulate the definition as follows: A set $A$ is countable iff
either $A$ is finite or $A$ has cardinality $\aleph_0$.

For example, the set $\omega$ of natural numbers, the set $\mathbb{Z}$ of
integers, and the set $\mathbb{Q}$ of rational numbers are all infinite
countable sets. But the set $\mathbb{R}$ of real numbers is uncountable
(Theorem 6B).

Any subset of a countable set is obviously countable. The union of two
countable sets is countable, as is their Cartesian product. (The union has at
most cardinality $\aleph_0 + \aleph_0$, the product at most
$\aleph_0 \cdot \aleph_0$. But both of these numbers equal $\aleph_0$.) On the
other hand, $\mathcal{P}A$ is uncountable for any infinite set $A$. (If
$2^{\kappa} \le \aleph_0$, then $\kappa < \aleph_0$.)

Recall (from the example preceding Theorem 6N) that a nonempty set $B$
is countable iff there is a function from $\omega$ onto $B$. This fact is used in
the proof of the next theorem.


Theorem 6Q  A countable union of countable sets is countable. That is,
if $\mathcal{A}$ is countable and if every member of $\mathcal{A}$ is a
countable set, then $\bigcup \mathcal{A}$ is countable.


Proof  We may suppose that $\emptyset \notin \mathcal{A}$, for otherwise we
could simply remove it without affecting $\bigcup \mathcal{A}$. We may further
suppose that $\mathcal{A} \ne \emptyset$, since $\bigcup \emptyset$ is certainly
countable. Thus $\mathcal{A}$ is a countable (but nonempty) collection of
countable (but nonempty) sets. It suffices to construct a function from
$\omega \times \omega$ onto $\bigcup \mathcal{A}$. We already know of functions
from $\omega$ onto $\omega \times \omega$, and the composition will map $\omega$
onto $\bigcup \mathcal{A}$, thereby showing that $\bigcup \mathcal{A}$ is
countable.


Since $\mathcal{A}$ is countable but nonempty, there is a function $G$ from
$\omega$ onto $\mathcal{A}$. Informally, we may write

\[ \mathcal{A} = \{G(0), G(1), \ldots\}. \]

(Here $G$ might not be one-to-one, so there may be repetitions in this
enumeration.) We are given that each set $G(m)$ is countable and nonempty.
Hence for each $m$ there is a function from $\omega$ onto $G(m)$. We must use the
axiom of choice to select such a function for each $m$.

Because the axiom of choice is a recent addition to our repertoire, we
will describe its use here in some detail. Let
$H\colon \omega \to \mathcal{P}({}^{\omega}(\bigcup \mathcal{A}))$ be defined by

\[ H(m) = \{g \mid g \text{ is a function from } \omega \text{ onto } G(m)\}. \]

We know that $H(m)$ is nonempty for each $m$. Hence there is a function $F$
with domain $\omega$ such that for each $m$, $F(m)$ is a function from $\omega$
onto $G(m)$.

To conclude the proof we have only to let $f(m, n) = F(m)(n)$. Then $f$
is a function from $\omega \times \omega$ onto $\bigcup \mathcal{A}$. ⊣
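
The final composition in this proof can be made concrete. In the Python sketch below, which is an illustration under our own assumptions, $G(m)$ is taken to be the set of nonnegative rationals with denominator $m + 1$, and the enumeration $F(m)$ of $G(m)$ is given explicitly (so no choice is needed in this instance). The function `unpair` is one of the known maps from $\omega$ onto $\omega \times \omega$, walking the diagonals $m + n = 0, 1, 2, \ldots$.

```python
from fractions import Fraction

def unpair(k):
    """Map the natural number k to a pair (m, n); every pair is attained exactly once."""
    d = 0
    while k > d:
        k -= d + 1
        d += 1
    return (k, d - k)

def F(m):
    """The selected function from ω onto G(m) = {n / (m + 1) : n in ω}."""
    return lambda n: Fraction(n, m + 1)

def f(m, n):
    """f(m, n) = F(m)(n), a function from ω × ω onto the union of the sets G(m)."""
    return F(m)(n)

def enumerate_union(k):
    """The composition: a function from ω onto the union (repetitions are allowed)."""
    return f(*unpair(k))

listing = [enumerate_union(k) for k in range(30)]
```

The listing contains repetitions (for example 1/2 and 2/4 are the same rational), which is harmless: countability only requires a function from $\omega$ onto the union.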


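Example  For any set $A$, define a sequence in $A$ to be a function from
some natural number into $A$. Let Sq($A$) be the set of all sequences in $A$:

\[ \operatorname{Sq}(A) = \{f \mid (\exists n \in \omega)\; f \text{ maps } n \text{ into } A\} = {}^{0}A \cup {}^{1}A \cup {}^{2}A \cup \cdots. \]

The length of a sequence is simply its domain.

In order to verify that Sq($A$) is a legal set, note that if
$f\colon n \to A$, then

\[ f \subseteq n \times A \subseteq \omega \times A, \]

so that $f \in \mathcal{P}(\omega \times A)$. Hence
$\operatorname{Sq}(A) \subseteq \mathcal{P}(\omega \times A)$.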


We now list some observations that establish the existence of
transcendental real numbers.


1. Sq($\omega$) has cardinality $\aleph_0$. This can be proved by using
primarily the fact that $\omega \times \omega \approx \omega$. Another very direct
proof is the following. Consider any $f \in \operatorname{Sq}(\omega)$; say the
length of $f$ is $n$. Then define

\[ H(f) = 2^{f(0)+1} \cdot 3^{f(1)+1} \cdots p_{n-1}^{f(n-1)+1}, \]

where $p_i$ is the $(i + 1)$st prime. (If the length of $f$ is 0, then
$H(f) = 1$.) Thus $H\colon \operatorname{Sq}(\omega) \to \omega$ and by the
fundamental theorem of arithmetic (which states that prime factorizations are
unique) $H$ is one-to-one. Hence
$\operatorname{card} \operatorname{Sq}(\omega) \le \aleph_0$, and the opposite
inequality is clear. (In Chapter 4 we did not actually develop the theory of
prime numbers. But there are no difficulties in embedding any standard
development of the subject into set theory.)

2. Sq($A$) is countable for any countable set $A$. By the countability of
$A$ there is a one-to-one function $g$ from $A$ into $\omega$. This function
naturally induces a one-to-one map from Sq($A$) into Sq($\omega$). Hence
$\operatorname{card} \operatorname{Sq}(A) \le \operatorname{card} \operatorname{Sq}(\omega) = \aleph_0$.
(An alternative proof writes
$\operatorname{Sq}(A) = \bigcup \{{}^{n}A \mid n \in \omega\}$, a countable
union of countable sets.)

We can think of this set $A$ as an alphabet, and the elements of Sq($A$)
as being words on the alphabet $A$. In this terminology, the present example
can be stated: On any countable alphabet, there are countably many words.

3. There are $\aleph_0$ algebraic numbers. (Recall that an algebraic number
is a real number that is the root of some polynomial with integer
coefficients. For this purpose we exclude from the polynomials the function
that is identically equal to 0.) As a first step in counting the algebraic
numbers, note that the set $\mathbb{Z}$ of integers has cardinality
$\aleph_0 + \aleph_0 = \aleph_0$. Next we calculate the cardinality of the set
$P$ of polynomials with integer coefficients. We can assign to each polynomial
(of degree $n$) its sequence (of length $n + 1$) of coefficients, e.g.,
$1 + 7x - 5x^2 + 3x^4$ is assigned the sequence of length 5 whose successive
values are 1, 7, −5, 0, 3. This defines a one-to-one map from $P$ into
Sq($\mathbb{Z}$), a countable set. Hence $P$ is countable. Since each polynomial
in $P$ has only finitely many roots, the set of algebraic numbers is a
countable union of finite sets. Hence it is countable, by Theorem 6Q. Since the
set of algebraic numbers is certainly infinite, it has cardinality $\aleph_0$.

4. There are uncountably many transcendental numbers. (Recall that
a transcendental number is defined to be a real number that is not algebraic.)
Since the set of algebraic numbers is countable, the set of transcendental
numbers cannot also be countable lest the set $\mathbb{R}$ be countable. (Soon
we will be able to show that the set of transcendental numbers has cardinality
$2^{\aleph_0}$.)
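
The coding $H$ of finite sequences by prime powers, used in the first observation above, can be computed and inverted mechanically. The following Python sketch is an illustration; the names `primes`, `encode`, and `decode` are ours, and `decode` is meant to be applied only to numbers in the range of `encode`.

```python
def primes():
    """Yield 2, 3, 5, 7, ... by trial division against the primes found so far."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def encode(seq):
    """H(f) = 2^(f(0)+1) * 3^(f(1)+1) * ... for a finite sequence f of natural numbers."""
    code = 1
    for p, value in zip(primes(), seq):
        code *= p ** (value + 1)
    return code

def decode(code):
    """Recover the sequence from H(f) by reading off the exponents of successive primes."""
    seq = []
    for p in primes():
        if code == 1:
            break
        exponent = 0
        while code % p == 0:
            code //= p
            exponent += 1
        seq.append(exponent - 1)
    return seq

code = encode([1, 0, 2])     # 2^2 * 3^1 * 5^3
```

The "+1" in each exponent is what lets the length of the sequence be recovered: the sequences 0 and 0, 0 receive the different codes 2 and 6.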


Exercises 


26. Prove the following generalization of Theorem 6Q: If every member
of a set $\mathcal{A}$ has cardinality $\kappa$ or less, then

\[ \operatorname{card} \bigcup \mathcal{A} \le (\operatorname{card} \mathcal{A}) \cdot \kappa. \]


27. (a) Let A be a collection of circular disks in the plane, no two of 
which intersect. Show that A is countable. 
(b) Let B be a collection of circles in the plane, no two of which 
intersect. Need B be countable? 
(c) Let C be a collection of figure eights in the plane, no two of 
which intersect. Need C be countable? 


28. Find a set $\mathcal{A}$ of open intervals in $\mathbb{R}$ such that every
rational number belongs to one of those intervals, but
$\bigcup \mathcal{A} \ne \mathbb{R}$. [Suggestion: Limit the sum of the lengths
of the intervals.]


29. Let $A$ be a set of positive real numbers. Assume that there is a bound
$b$ such that the sum of any finite subset of $A$ is less than $b$. Show that
$A$ is countable.


30. Assume that $A$ is a set with at least two elements. Show that
$\operatorname{Sq}(A) \preccurlyeq {}^{\omega}A$.


ARITHMETIC OF INFINITE CARDINALS


We can use the axiom of choice to show that adding or multiplying
two infinite cardinals is a more trivial matter than it first appeared to be.


Lemma 6R  For any infinite cardinal $\kappa$, we have
$\kappa \cdot \kappa = \kappa$.


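Proof  Let $B$ be a set of cardinality $\kappa$. It would suffice to show that
$B \times B \approx B$, and we will almost do this. Define

\[ \mathcal{H} = \{f \mid f = \emptyset \text{ or for some infinite } A \subseteq B,\ f \text{ is a one-to-one correspondence between } A \times A \text{ and } A\}. \]

Our strategy is to use Zorn’s lemma to obtain a maximal function $f_0$
in $\mathcal{H}$. Although $f_0$ might not quite show that
$B \times B \approx B$, it will come close enough. (The reason for including
$\emptyset$ as a member of $\mathcal{H}$ is that our statement of Zorn’s lemma
requires that $\mathcal{H}$ contain the union of any chain, even the empty
chain.)

Before applying Zorn’s lemma to $\mathcal{H}$, we must check for closure under
unions of chains. Let $\mathcal{C}$ be a chain included in $\mathcal{H}$. We may
suppose that $\mathcal{C}$ contains some nonempty function, since otherwise
$\bigcup \mathcal{C} = \emptyset \in \mathcal{H}$. As we have seen before,
$\bigcup \mathcal{C}$ is again a one-to-one function. Define

\[ A = \bigcup \{\operatorname{ran} f \mid f \in \mathcal{C}\} = \operatorname{ran} \bigcup \mathcal{C} \]

(compare Exercise 8 of Chapter 3). $A$ is infinite since $\mathcal{C}$ contains
some nonempty function. We claim that $\bigcup \mathcal{C}$ is a one-to-one
correspondence between $A \times A$ and $A$. The only part of this claim not yet
verified is that $\operatorname{dom} \bigcup \mathcal{C} = A \times A$. First
consider any $\langle a_1, a_2 \rangle \in A \times A$. Then
$a_1 \in \operatorname{ran} f_1$ and $a_2 \in \operatorname{ran} f_2$ for some
$f_1$ and $f_2$ in $\mathcal{C}$. Either $f_1 \subseteq f_2$ or
$f_2 \subseteq f_1$; by symmetry we may suppose that $f_1 \subseteq f_2$. Then

\[ \langle a_1, a_2 \rangle \in \operatorname{ran} f_2 \times \operatorname{ran} f_2 = \operatorname{dom} f_2 \subseteq \bigcup \{\operatorname{dom} f \mid f \in \mathcal{C}\} = \operatorname{dom} \bigcup \mathcal{C}. \]

Conversely any member of $\operatorname{dom} \bigcup \mathcal{C}$ belongs to
$\operatorname{dom} f$ for some $f \in \mathcal{C}$. But
$\operatorname{dom} f = \operatorname{ran} f \times \operatorname{ran} f \subseteq A \times A$.
Thus $\operatorname{dom} \bigcup \mathcal{C} = A \times A$. So
$\bigcup \mathcal{C}$ is a one-to-one correspondence between $A \times A$ and
$A$; hence $\bigcup \mathcal{C} \in \mathcal{H}$.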

Zorn’s lemma now provides us with a maximal $f_0 \in \mathcal{H}$. First, we
must check that $f_0 \ne \emptyset$. Since $B$ is infinite, it has a subset $A$
of cardinality $\aleph_0$. Because $\aleph_0 \cdot \aleph_0 = \aleph_0$, there is
a one-to-one correspondence $g$ between $A \times A$ and $A$. Thus
$g \in \mathcal{H}$; since $g$ properly extends $\emptyset$, it is impossible
for $\emptyset$ to be maximal in $\mathcal{H}$. Hence $f_0 \ne \emptyset$. So by
the definition of $\mathcal{H}$, $f_0$ is a one-to-one correspondence between
$A_0 \times A_0$ and $A_0$, where $A_0$ is some infinite subset of $B$.

Let $\lambda = \operatorname{card} A_0$; then $\lambda$ is infinite and
$\lambda \cdot \lambda = \lambda$. One might hope that by virtue of the
maximality, $A_0$ would be all of $B$. This may not be true, but we will show
that $\lambda = \kappa$ and that $B - A_0$ has smaller cardinality.

In order to show that $\operatorname{card}(B - A_0) < \lambda$, suppose that to
the contrary $\lambda \le \operatorname{card}(B - A_0)$. Then $B - A_0$ has a
subset $D$ of cardinality $\lambda$. We will show that this contradicts the
maximality of $f_0$ by finding a proper extension of $f_0$ that is a one-to-one
correspondence between the sets $(A_0 \cup D) \times (A_0 \cup D)$ and
$A_0 \cup D$. We have

\[ (A_0 \cup D) \times (A_0 \cup D) = (A_0 \times A_0) \cup (A_0 \times D) \cup (D \times A_0) \cup (D \times D) \]

(see Fig. 42). Of the four sets on the right side of this equation,
$A_0 \times A_0$ is already paired with $A_0$ by $f_0$. The remainder

\[ (A_0 \times D) \cup (D \times A_0) \cup (D \times D) \]

(shaded in Fig. 42) clearly has cardinality

\[ \lambda \cdot \lambda + \lambda \cdot \lambda + \lambda \cdot \lambda = \lambda + \lambda + \lambda = 3 \cdot \lambda \le \lambda \cdot \lambda = \lambda. \]

Hence there is a one-to-one correspondence $g$ between the shaded sets and
$D$. Then $f_0 \cup g \in \mathcal{H}$ and properly extends $f_0$, contradicting
the maximality of $f_0$. Hence our supposition that
$\lambda \le \operatorname{card}(B - A_0)$ is false. By cardinal comparability,
$\operatorname{card}(B - A_0) < \lambda$.

Fig. 42. The set $B \times B$ (in Lemma 6R).




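Finally we have

\[ \kappa = \operatorname{card} A_0 + \operatorname{card}(B - A_0) \le \lambda + \lambda = 2 \cdot \lambda \le \lambda \cdot \lambda = \lambda \le \kappa, \]

whence $\lambda = \kappa$. Hence $\kappa \cdot \kappa = \kappa$. ⊣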


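Absorption Law of Cardinal Arithmetic  Let $\kappa$ and $\lambda$ be cardinal
numbers, the larger of which is infinite and the smaller of which is nonzero.
Then

\[ \kappa + \lambda = \kappa \cdot \lambda = \max(\kappa, \lambda). \]

Proof  By the symmetry we may suppose that $\lambda \le \kappa$. Then

\[ \kappa \le \kappa + \lambda \le \kappa + \kappa = \kappa \cdot 2 \le \kappa \cdot \kappa = \kappa \]

and

\[ \kappa \le \kappa \cdot \lambda \le \kappa \cdot \kappa = \kappa. \]

Hence equality holds throughout. ⊣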


Example  We do not have a well-defined subtraction operation for infinite
cardinal numbers. If you start with $\aleph_0$ things and take away $\aleph_0$
things, then the number of remaining things can be anywhere from 0 to
$\aleph_0$, depending on which items were removed. But if you start with
$\kappa$ things (where $\kappa$ is infinite) and remove $\lambda$ things where
$\lambda$ is strictly less than $\kappa$, then exactly $\kappa$ things remain. To
prove this, let $\mu$ be the cardinality of the remaining items. Then
$\kappa = \lambda + \mu = \max(\lambda, \mu)$, so that we must have
$\mu = \kappa$.


Example  There are exactly $2^{\aleph_0}$ transcendental numbers. This follows
from the preceding example. If from the $2^{\aleph_0}$ real numbers we remove
the countable set of algebraic numbers, then $2^{\aleph_0}$ numbers remain.


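Example  For any infinite cardinal $\kappa$, we have
$\kappa^{\kappa} = 2^{\kappa}$. To prove this, observe that

\[ \kappa^{\kappa} \le (2^{\kappa})^{\kappa} = 2^{\kappa \cdot \kappa} = 2^{\kappa} \le \kappa^{\kappa}, \]

whence equality holds throughout.

We conclude this section with two counting problems: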


1. How many functions from R into R are there? 
2. How many of these are continuous? 


The first question asks for the cardinality of the set
${}^{\mathbb{R}}\mathbb{R}$. The cardinal number of this set is

\[ (2^{\aleph_0})^{2^{\aleph_0}} = 2^{\aleph_0 \cdot 2^{\aleph_0}} = 2^{2^{\aleph_0}}. \]

This last expression cannot be further simplified; it provides the answer to
the first question. (As with finite numbers, $\kappa^{\lambda^{\mu}}$ means
$\kappa^{(\lambda^{\mu})}$ and not $(\kappa^{\lambda})^{\mu}$.)

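Now consider the second question; let $C(\mathbb{R})$ be the set of continuous
functions in ${}^{\mathbb{R}}\mathbb{R}$. It is easy to see that

\[ 2^{\aleph_0} \le \operatorname{card} C(\mathbb{R}) \le 2^{2^{\aleph_0}}, \]

but we need an exact answer. We claim that
$\operatorname{card} C(\mathbb{R}) = 2^{\aleph_0}$. To prove this, we will
consider the restriction of the continuous functions to the set $\mathbb{Q}$ of
rational numbers (where $\mathbb{Q}$ is regarded as a subset of $\mathbb{R}$).
If $f$ and $g$ are two distinct continuous functions, then $f - g$ is not
identically zero. (Here $f - g$ is the result of subtracting $g$ from $f$, not
the relative complement.) Hence by continuity, there is an open interval
throughout which $f - g$ is nonzero. In this interval lies some rational, so
$f \restriction \mathbb{Q} \ne g \restriction \mathbb{Q}$. Hence the map from
$C(\mathbb{R})$ into ${}^{\mathbb{Q}}\mathbb{R}$ assigning to each continuous
function $f$ its restriction $f \restriction \mathbb{Q}$ is a one-to-one map.
Thus $C(\mathbb{R}) \preccurlyeq {}^{\mathbb{Q}}\mathbb{R}$ and so

\[ \operatorname{card} C(\mathbb{R}) \le \operatorname{card} {}^{\mathbb{Q}}\mathbb{R} = (2^{\aleph_0})^{\aleph_0} = 2^{\aleph_0}. \]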


Exercises 


31. In the proof of Lemma 6R we utilized a certain set $\mathcal{H}$. How do we
show from the axioms that such a set exists? 


32. Let $\mathcal{F}A$ be the collection of all finite subsets of A. Show that if A is
infinite, then $A \approx \mathcal{F}A$.


33. Assume that A is an infinite set. Prove that $A \approx \operatorname{Sq}(A)$.


34. Assume that $2 \le \kappa \le \lambda$ and $\lambda$ is infinite. Show that $\kappa^\lambda = 2^\lambda$.


35. Find a collection $\mathcal{A}$ of $2^{\aleph_0}$ sets of natural numbers such that any two
distinct members of $\mathcal{A}$ have finite intersection. [Suggestion: Start with the
collection of infinite sets of primes.] 


36. Show that for an infinite cardinal $\kappa$, we have $\kappa! = 2^\kappa$, where $\kappa!$ is
defined as in Exercise 14. 


CONTINUUM HYPOTHESIS 


We have in this chapter given some examples of countable sets and 
uncountable sets. But every uncountable set examined thus far has had 
cardinality $2^{\aleph_0}$ or more. This raises the question: Are there any sets with
cardinality between $\aleph_0$ and $2^{\aleph_0}$? The “continuum hypothesis” is the assertion
that the answer is negative, i.e., that there is no $\kappa$ with $\aleph_0 < \kappa < 2^{\aleph_0}$.
Or equivalently, the continuum hypothesis can be stated: Every uncountable 
set of real numbers is equinumerous to the set of all real numbers. 

Cantor conjectured that the continuum hypothesis was true. And
David Hilbert later published a purported proof. But the proof was 
incorrect, and more recent work has cast doubt on the continuum 
hypothesis. In 1939 Gödel proved that on the basis of our axioms for 
set theory (which we here assume to be consistent) the continuum hypothesis 
could not be disproved. Then in 1963 Paul Cohen showed that the 
continuum hypothesis could not be proved from our axioms either. 

But since the continuum hypothesis is neither provable nor refutable 
from our axioms, what can we say about its truth or falsity? We have 
some informal ideas about what sets are like, but our intuition might 
not assign a definite answer to the continuum hypothesis. Indeed, one might 
well question whether there is any meaningful sense in which one can 
say that the continuum hypothesis is either true or false for the “real” sets. 
Among those set-theorists nowadays who feel that there is such a meaningful 
sense, the majority seems to feel that the continuum hypothesis is false. 

The “generalized continuum hypothesis” is the assertion that for every 
infinite cardinal $\kappa$, there is no cardinal number between $\kappa$ and $2^\kappa$. Gödel’s
1939 work shows that even the generalized continuum hypothesis cannot
be disproved from our axioms. And of course Cohen’s result shows that
it cannot be proved from our axioms (even in the special case $\kappa = \aleph_0$).

There is the possibility of extending the list of axioms beyond those in 
this book. And the new axioms might conceivably allow us to prove or to 
refute the continuum hypothesis. But to be acceptable as an axiom, a 
statement must be in clear accord with our informal ideas of the concepts 
being axiomatized. It would not do, for example, simply to adopt the 
generalized continuum hypothesis as a new axiom. It remains to be seen 
whether any acceptable axioms will be found that settle satisfactorily the 
correctness or incorrectness of the continuum hypothesis. 

The work of Gödel and Cohen also shows that the axiom of choice can 
neither be proved nor refuted from the other axioms (which we continue 
to assume are consistent). But unlike the continuum hypothesis, the axiom 
of choice conforms to our informal view of how sets should behave. For 
this reason, we have adopted it as an axiom. 

Results such as those by Gödel and Cohen belong to the metamathematics
of set theory. That is, they are results that speak of set theory itself, in 
contrast to theorems within set theory that speak of sets. 


CHAPTER 7 


ORDERINGS AND ORDINALS 


In this chapter we will begin by discussing both linear orderings 
(mentioned briefly in Chapter 3) and, more generally, partial orderings. 
But we will soon focus our attention on a special case of the linear 
orderings, namely the so-called well orderings. The well orderings will lead 
us to the study of ordinal numbers and to the fulfillment of promises 
(e.g., the definition of cardinal numbers) that have been made. 


PARTIAL ORDERINGS 


A partial ordering is a special sort of relation. Before making any 
definitions, it will be helpful to consider a few examples. 


1. Let S be any fixed set, and let $\subset_S$ be the relation of strict inclusion
on subsets of S:

$$\subset_S = \{\langle A, B\rangle \mid A \subseteq B \subseteq S \ \&\ A \ne B\}.$$

Of course we write “$A \subset_S B$” in place of the more awkward “$\langle A, B\rangle \in \subset_S$.”

2. Let P be the set of positive integers. The strict divisibility relation
on P is

$$\{\langle a, b\rangle \in P \times P \mid a \cdot q = b \text{ for some } q \ne 1\}.$$

For example, 5 divides 60 (here q = 12), but 2 does not divide 3, nor
does 3 divide 2. No number in P strictly divides itself. 

3. For the set R of real numbers, we have the usual ordering relation <. 
For any distinct real numbers x and y, either x < y or y < x. 


Definition A partial ordering is a relation R meeting the following two 
conditions: 


(a) R is a transitive relation: $xRy \ \&\ yRz \Rightarrow xRz$.
(b) R is irreflexive: It is never the case that xRx. 


In the foregoing examples, it is easy to see that $\subset_S$, strict divisibility,
and < are all partial ordering relations. The preferred symbols for partial
ordering relations are < and similar symbols, e.g., $\prec$, $\lhd$, and the like.
If < is such a relation, then we can define:


$$x \le y \quad\text{iff}\quad \text{either } x < y \text{ or } x = y.$$


[Fig. 43. The inclusion ordering on {0, 1, 2}.]

[Fig. 44. A finite part of an infinite picture.]


It is easy to see that, for example,

$$x \le y < z \ \Rightarrow\ x < z.$$


At the end of Chapter 3 we drew some pictures (Fig. 15) of orderings. 
For partial orderings the pictures show more variety. Again we represent 
the members of A by dots, placing the dot for x below the dot for y 
whenever x < y, and drawing lines to connect the dots. For example, 
Fig. 43 is a picture of the partial ordering $\subset_S$ on the set S = {0, 1, 2}.
There is no need to add a direct nonstop line from $\emptyset$ to {0, 1}, because
there are lines from $\emptyset$ to {0} and from {0} to {0, 1}. Figure 44 is a picture
of a small piece of the divisibility ordering on the positive integers; the 
full picture would have to be infinite. 
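
For the small case of Fig. 43, the lines that need to be drawn can be computed mechanically. The following Python sketch is only an illustration under our own naming (the helpers powerset and covering_pairs are assumptions, not notation from the text): it lists the pairs x < y with no element strictly between them, which are the lines one actually draws.

```python
from itertools import combinations

def powerset(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def covering_pairs(elements, less):
    """Pairs (x, y) with x < y and no z satisfying x < z < y."""
    return [(x, y) for x in elements for y in elements
            if less(x, y)
            and not any(less(x, z) and less(z, y) for z in elements)]

subsets = powerset({0, 1, 2})

def strict_inclusion(a, b):
    # for frozensets, a < b means "a is a proper subset of b"
    return a < b

lines = covering_pairs(subsets, strict_inclusion)
print(len(lines))                                   # 12
print((frozenset(), frozenset({0, 1})) in lines)    # False: no direct line needed
print((frozenset(), frozenset({0})) in lines)       # True
```

The pair from the empty set to {0, 1} is omitted because {0} lies strictly between them, in agreement with the remark about Fig. 43.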

The following theorem lists some easy consequences of our definitions. 


Theorem 7A Assume that < is a partial ordering. Then for any x, y, 
and z: 


(a) At most one of the three alternatives,

$$x < y, \qquad x = y, \qquad y < x,$$

can hold.
(b) $x \le y \le x \ \Rightarrow\ x = y$.


Proof In part (a), if we had both x < y and x = y, then we would 
have x < x, contradicting irreflexivity. And if both x < y and y < x, then 
by transitivity x < x, again contradicting irreflexivity. 
In part (b), if $x \ne y$ then we would have x < y < x, contradicting part
(a). $\dashv$


We are particularly interested in cases where part (a) of this theorem 
can be strengthened to read “exactly one,” i.e., trichotomy holds. 


Definition R is a linear ordering on A iff R is a binary relation on A 
that is a transitive relation and satisfies trichotomy on A, i.e., for any 
x and y in A exactly one of the three alternatives, 


$$xRy, \qquad x = y, \qquad yRx,$$
holds. 


Note that we have to specify the set A in this definition. We can also 
speak of a partial ordering on A, this being just a binary relation on A 
that is a partial ordering. 


Examples The usual ordering relation on R is linear. Strict divisibility is 
a partial ordering, but is not a linear ordering on the positive integers. 
And inclusion $\subset_S$ is nonlinear if S has two or more members.
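
On finite samples, the three examples can be tested directly against the definitions. The sketch below is illustrative only: the function names and the particular finite samples are our assumptions, and a check on a sample is of course not a proof for the whole set.

```python
from itertools import combinations

def is_transitive(elems, less):
    return all(less(x, z) for x in elems for y in elems for z in elems
               if less(x, y) and less(y, z))

def is_irreflexive(elems, less):
    return not any(less(x, x) for x in elems)

def is_trichotomous(elems, less):
    # exactly one of x < y, x = y, y < x must hold for every pair
    return all([less(x, y), x == y, less(y, x)].count(True) == 1
               for x in elems for y in elems)

def strictly_divides(a, b):
    # a * q = b for some q != 1, for positive integers a and b
    return a != b and b % a == 0

def report(elems, less):
    return (is_transitive(elems, less),
            is_irreflexive(elems, less),
            is_trichotomous(elems, less))

positive = list(range(1, 31))                      # a finite piece of P
subsets = [frozenset(c) for r in range(3) for c in combinations([0, 1], r)]
reals = [-1.5, 0.0, 0.5, 2.0, 3.25]                # a finite sample of R

print(report(positive, strictly_divides))          # (True, True, False)
print(report(subsets, lambda a, b: a < b))         # (True, True, False)
print(report(reals, lambda x, y: x < y))           # (True, True, True)
```

Trichotomy fails for divisibility at the pair 2, 3 and for inclusion at the pair {0}, {1}.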


Notice that if R is a linear ordering on A, then R is also a partial 
ordering (since trichotomy implies irreflexivity—recall Theorem 3R). 


Digression In the study of partial orderings, there is always the question
whether to use strict orderings (<) or weak orderings ($\le$) as the basic
concept. We have taken the “<” course, by demanding that a partial
ordering be irreflexive. For the “$\le$” version, one demands that a partial
ordering on A be reflexive on A. Yet a third course is to specify neither
extreme, but to allow any pairs $\langle x, x\rangle$ to belong to the ordering or not,
as they please. Each alternative has its own minor advantages and its own
drawbacks. One feature of demanding reflexivity is that whenever $\le$ is a
partial ordering on A, then $A = \operatorname{fld} \le$. Consequently we can treat just the
ordering $\le$; the set A is then encoded inside this relation. For strict orderings
this feature is lost; for example, $\emptyset$ is a partial ordering on any set.

Often we will want to refer to a set A together with an ordering < on A.


Definition A structure is a pair $\langle A, R\rangle$ consisting of a set A and a
binary relation R on A (i.e., $R \subseteq A \times A$).


In particular, we can speak of a partially (or linearly) ordered structure 
if R is a partial (or linear) ordering relation on A. (The terms poset and 
loset are sometimes used here.) 

There is a certain amount of terminology for orderings that has proved 
sufficiently useful to have become standard. Let < be a partial ordering; 
let D be a set. An element m of D is said to be a minimal element of D 
iff there is no x in D with x < m (or equivalently, iff for all x in D,
$x \not< m$). And m is a least (or smallest or minimum) element of D iff $m \le x$
for all x in D. A least element is also minimal. For a linear ordering
on a set that includes D the two concepts coincide, since


$$x \not< m \ \Rightarrow\ m \le x$$


for a linear ordering. But in the nonlinear case, minimality is weaker than 
leastness. 


Example Consider the strict divisibility relation on the set P of positive 
integers. Then 1 is the least element of P. But let D be $\{a \in P \mid a \ne 1\}$.
Then every prime is a minimal element of D, and D has no least element. 
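
The example can be checked on a finite piece of P. In the following sketch the cutoff 30 and the helper names are our own assumptions. Because every divisor of a number below the cutoff is itself below the cutoff, minimality within the sample agrees with minimality in D for these numbers.

```python
def strictly_divides(a, b):
    # a * q = b for some q != 1, for positive integers a and b
    return a != b and b % a == 0

def minimal_elements(D, less):
    """Members m of D such that no x in D satisfies x < m."""
    return [m for m in D if not any(less(x, m) for x in D)]

def least_element(D, less):
    """The m in D with m <= x for all x in D, or None if there is none."""
    for m in D:
        if all(m == x or less(m, x) for x in D):
            return m
    return None

P = list(range(1, 31))
D = [a for a in P if a != 1]

print(least_element(P, strictly_divides))      # 1
print(minimal_elements(D, strictly_divides))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(least_element(D, strictly_divides))      # None
```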


A least element of a set (if one exists) is automatically unique. 
(If $m_1$ and $m_2$ are both least, then $m_1 \le m_2$ and $m_2 \le m_1$, whence equality
holds.) In this event, the least element is the only minimal element. But a 
set can have many minimal elements, as the above example shows. It is 
also possible for a set to have no minimal elements at all. (What is an 
example of such a set?) 

Similarly an element m of D is a maximal element of D iff there is no 
x in D with m < x (or equivalently, iff for all x in D, $m \not< x$). And m is a
greatest (or largest or maximum) element of D iff $x \le m$ for all x in D.
The remarks we have made concerning minimal and least elements carry 
over to maximal and greatest elements. That is, a greatest element is also 
maximal. For a linear ordering (on a set that includes D) the two concepts 
coincide. A set can have only one greatest element. But if a set has no 
greatest element, it can have many maximal elements (or it might have none). 

If R is a partial ordering, then $R^{-1}$ is also a partial ordering
(Exercise 2). And for a set D, the minimal elements of D with respect to
$R^{-1}$ are exactly the maximal elements with respect to R. When it is
necessary to specify the ordering, we can speak of “R-minimal” elements. 

Suppose that < is a partial ordering on A and consider a subset 
C of A. An upper bound of C is an element $b \in A$ such that $x \le b$ for all
$x \in C$. Here b may or may not belong to C; if it belongs to C, then it is
clearly the greatest element of C. If b is the least element of the set of all 
upper bounds for C, then b is the least upper bound (or supremum) of C. 
Lower bound and greatest lower bound (or infimum) are defined analogously. 


Example Consider a fixed set S and the partial ordering $\subset_S$ on $\mathcal{P}S$. For
A and B in $\mathcal{P}S$, the set {A, B} has a least upper bound (with respect to
$\subset_S$), namely $A \cup B$. Similarly $A \cap B$ is the greatest lower bound of {A, B}.
If $\mathcal{A} \subseteq \mathcal{P}S$, then $\bigcup \mathcal{A}$ is the least upper bound of $\mathcal{A}$. And $\bigcap \mathcal{A}$ (if $\mathcal{A} \ne \emptyset$)
is the greatest lower bound.


Example Next consider the set Q of rational numbers with its usual
linear ordering relation <. The set of positive rational numbers has no
upper bounds at all. The set $\{x \in Q \mid x^2 < 2\}$ has upper bounds in Q, but
no least upper bound in Q. 


Example In the set of positive integers ordered by divisibility, greatest 
lower bounds are called “greatest common divisors” (g.c.d.’s). And least 
upper bounds are called “least common multiples” (l.c.m.’s). In this ordering,
any finite set has an l.c.m., but infinite sets have no upper bounds. Any 
nonempty set of positive integers has a g.c.d. 
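
For a particular finite set the two bounds can be verified by brute force. The sketch below is only an illustration: the set C = {12, 18, 30}, the search limit 1000 for upper bounds, and the helper names are our assumptions.

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def divides(a, b):
    # the weak divisibility ordering: a <= b means a divides b
    return b % a == 0

C = [12, 18, 30]
g = reduce(gcd, C)        # candidate greatest lower bound
m = reduce(lcm, C)        # candidate least upper bound

lower_bounds = [d for d in range(1, max(C) + 1) if all(divides(d, c) for c in C)]
upper_bounds = [u for u in range(1, 1000) if all(divides(c, u) for c in C)]

print(g, lower_bounds)    # 6 [1, 2, 3, 6]
print(m, upper_bounds)    # 180 [180, 360, 540, 720, 900]
print(all(divides(d, g) for d in lower_bounds))   # True: g is greatest among lower bounds
print(all(divides(m, u) for u in upper_bounds))   # True: m is least among those found
```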


Exercises 


1. Assume that $<_A$ and $<_B$ are partial orderings on A and B, respectively,
and that f is a function from A into B satisfying


$$x <_A y \ \Rightarrow\ f(x) <_B f(y)$$


for all x and y in A.
(a) Can we conclude that f is one-to-one?
(b) Can we conclude that


$$x <_A y \ \Leftrightarrow\ f(x) <_B f(y)?$$


2. Assume that R is a partial ordering. Show that $R^{-1}$ is also a partial
ordering. 


3. Assume that S is a finite set having n elements. Show that a linear 
ordering on S contains $\frac{1}{2}n(n - 1)$ pairs.


WELL ORDERINGS 


Definition A well ordering¹ on A is a linear ordering on A with the
further property that every nonempty subset of A has a least element. 


For example, we proved in Chapter 4 that the usual ordering on $\omega$ is
a well ordering. But the usual ordering on the set Z of integers is not a 
well ordering; Z has no least element. 

Well orderings are significant because they can be used to index
constructions that proceed “from the bottom up,” where at every stage in
the construction (except the last) there is a unique next step. This idea is
made precise in the transfinite recursion theorem, which we will encounter
presently.


¹ This is a grammatically unfortunate phrase, in that it uses “well” as an adjective. Some
authors insert a hyphen, which helps a little, but not much.

We can very informally describe what well orderings are like; later 
we will give exact statements and proofs. Assume that < is a well ordering 
on A. Then the set A itself, if nonempty, has a least element $t_0$. And
then what is left, $A - \{t_0\}$, has (if it is nonempty) a least element $t_1$. Next
consider $A - \{t_0, t_1\}$, and so forth. We obtain


$$t_0 < t_1 < t_2 < \cdots.$$


There may be still more in $A - \{t_0, t_1, \ldots\}$, in which case there is a least
element $t_\omega$. And this might continue:


$$t_0 < t_1 < \cdots < t_\omega < t_{\omega+1} < \cdots < t_{\omega \cdot 2} < t_{\omega \cdot 2 + 1} < \cdots.$$


Eventually we use up all the elements of A, and things grind to a halt. 
We can use the axiom of choice to obtain another way of characterizing 
the well orderings. 


Theorem 7B Let < be a linear ordering on A. Then it is a well ordering
iff there does not exist any function $f\colon \omega \to A$ with $f(n^+) < f(n)$ for every
$n \in \omega$.


A function $f\colon \omega \to A$ for which $f(n^+) < f(n)$ (for all $n \in \omega$) is sometimes
called a descending chain, but this terminology should not be confused with 
other uses we make of the word “chain.” The theorem asserts that, for a 
linear ordering <, it is a well ordering iff there are no descending chains. 


Proof If there is a descending chain f, then ran f is a nonempty subset
of A. And clearly ran f has no least element; for each element f(n) there is
a smaller element $f(n^+)$. Hence the existence of a descending chain implies
that < is not a well ordering.

Conversely, assume that < is not a well ordering on A, so that some
nonempty $B \subseteq A$ lacks a least element. Then $(\forall x \in B)(\exists y \in B)\ y < x$. By
Exercise 20 of Chapter 6, there is a descending chain $f\colon \omega \to B$ with
$f(n^+) < f(n)$ for each n in $\omega$. $\dashv$


If < is some sort of ordering on A (at least a partial ordering) and
$t \in A$, then the set


$$\operatorname{seg} t = \{x \mid x < t\}$$


is called the initial segment up to t. (A less ambiguous notation would be
$\operatorname{seg}_< t$, but in practice the simpler notation suffices.) For example, $\omega$ was
ordered by $\in$, and consequently we had for $n \in \omega$,


$$\operatorname{seg} n = \{x \mid x \in n\} = n.$$


Transfinite Induction Principle Assume that < is a well ordering on A.
Assume that B is a subset of A with the special property that for every t
in A,

$$\operatorname{seg} t \subseteq B \ \Rightarrow\ t \in B.$$

Then B coincides with A.


Before proving this, let us define a subset B of A to be a <-inductive 
subset of A iff it has this special property, i.e., for every t in A, 


$$\operatorname{seg} t \subseteq B \ \Rightarrow\ t \in B.$$


This condition, in words, states that t’s membership in B is guaranteed
once it is known that all things less than t are already in B. The 
transfinite induction principle can be restated: If < is a well ordering on A, 
then any <-inductive subset of A must actually coincide with A. 


Proof If B is a proper subset of A, then $A - B$ has a least element m.
By the leastness, $y \in B$ for any y < m. But this is to say that $\operatorname{seg} m \subseteq B$, so
by assumption $m \in B$ after all. $\dashv$


The above proof is so short that we might take a second glance at the 
situation. First take the least element $t_0$ of A. Then $\operatorname{seg} t_0 = \emptyset$, and so
automatically $\operatorname{seg} t_0 \subseteq B$. The assumption on B then tells us that $t_0 \in B$.
Next we proceed to the least element $t_1$ of $A - \{t_0\}$. Then $\operatorname{seg} t_1 = \{t_0\} \subseteq B$,
so we obtain $t_1 \in B$. And so forth and so forth. But to make the “and so
forth” part secure, we have the foregoing actual proof of the principle. 

The transfinite induction principle is a generalization of the strong 
induction principle for $\omega$, encountered at the end of Chapter 4. The
following theorem is somewhat of a converse to transfinite induction. It 
asserts that the only linear orderings for which the transfinite induction 
principle is valid are the well orderings. Although we make no later use of 
this theorem, it is of interest in showing why one might choose to study 
well orderings. 


Theorem 7C Assume that < is a linear ordering on A. Further assume 
that the only <-inductive subset of A is A itself. That is, assume that for 
any $B \subseteq A$ satisfying the condition


(☆) $(\forall t \in A)(\operatorname{seg} t \subseteq B \Rightarrow t \in B)$,

we have B = A. Then < is a well ordering on A.


Proof Let C be any subset of A; we will show that either C has a
least element or C is empty. We decide (for reasons that may seem
mysterious at the moment) to consider the set B of “strict lower bounds”
of C:

$$B = \{t \in A \mid t < x \text{ for every } x \in C\}.$$

Note that $B \cap C = \emptyset$, lest we have t < t. We ask ourselves whether
condition (☆) holds for B.


Case I Condition (☆) fails. Then there exists some $t \in A$ with $\operatorname{seg} t \subseteq B$
but $t \notin B$. We claim that t is a least element of C. Since $t \notin B$, there is
some $x \in C$ with $x \le t$. But x cannot belong to seg t, which is disjoint
from C. Thus x = t and $t \in C$. And t is least in C, since anything smaller
than t is in seg t and hence not in C.


Case II Condition (☆) holds. Then by the hypothesis of the theorem,
A = B. Consequently in this case C is empty. $\dashv$


Next we turn to the important business of defining a function on a 
well-ordered structure by transfinite recursion. Assume that < is a well 
ordering on A. Conceivably we might possess some rule for defining a 
function value F(t) at $t \in A$, where the rule requires knowing first all values
F(x) for x < t. Then we can start with the least element $t_0$ of A, apply
our rule to find $F(t_0)$, go on to the next element, and so forth. That phrase
“and so forth” has to cover a great deal of ground. But, because < is a 
well ordering, after defining F on any proper subset B of A, we always 
have a unique next element (the least element of A — B) to which we can 
apply our rule so as to continue. 

Now let us try to make these ideas more concrete. The “rule” of the 
preceding paragraph might be provided by a function G. Then F(t) for 
$t \in A$ is to be found by applying G to the values F(x) for x < t:


$$F(t) = G(F \restriction \operatorname{seg} t).$$


We will say that F is “G-constructed” if the above equation holds for
every $t \in \operatorname{dom} F$. For the right side of this equation to succeed, the domain
of G must contain all functions of the form $F \restriction \operatorname{seg} t$. This leads us to
define, for a set B, the set ${}^{<A}B$ of all functions from initial segments of
< into B:


$${}^{<A}B = \{f \mid \text{for some } t \in A,\ f \text{ is a function from } \operatorname{seg} t \text{ into } B\}.$$


To check that there is indeed such a set, observe that if $f\colon \operatorname{seg} t \to B$,
then $f \subseteq A \times B$. Hence ${}^{<A}B$ is obtainable by applying a subset axiom to
$\mathcal{P}(A \times B)$.


Transfinite Recursion Theorem, Preliminary Form Assume that < is a
well ordering on A, and that $G\colon {}^{<A}B \to B$. Then there is a unique function
$F\colon A \to B$ such that for any $t \in A$,


$$F(t) = G(F \restriction \operatorname{seg} t).$$


Example In the case of the well-ordered set $\omega$, we have for each n in $\omega$
the equation $\operatorname{seg} n = \{x \mid x \in n\} = n$. Hence the above theorem asserts the
existence of a unique $F\colon \omega \to B$ satisfying for every n in $\omega$


$$F(n) = G(F \restriction n).$$

In particular, we have

$$F(0) = G(F \restriction 0) = G(\emptyset),$$
$$F(1) = G(F \restriction 1) = G(\{\langle 0, F(0)\rangle\}),$$
$$F(2) = G(F \restriction 2) = G(\{\langle 0, F(0)\rangle, \langle 1, F(1)\rangle\}).$$


This can be compared with the recursion theorem of Chapter 4. There 
is the difference that in Chapter 4 the value of F(n) was constructed by 
using only the one immediately preceding value of the function. But now 
in constructing F(n) we can use all of the previous values, which are given 
to us by $F \restriction n$. (This alteration is not made solely for reasons of generosity.
It is forced on us by the fact that in an arbitrary well ordering, there 
may not always be an “immediately preceding” element.) 
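
For the well-ordered set ω the construction can be carried out for finitely many steps on a computer. The following Python sketch is an illustration under our own assumptions: a dictionary stands for the restriction of F to n, the name construct is ours, and the particular rule G (one more than the sum of all earlier values) is a hypothetical example rather than one from the text.

```python
def construct(G, n_max):
    """Return {n: F(n)} for n < n_max, where F(n) = G(F restricted to n)."""
    F = {}
    for n in range(n_max):
        restriction = dict(F)      # at this point dom F = {0, ..., n-1} = n
        F[n] = G(restriction)      # G may consult every earlier value
    return F

def G(f):
    # hypothetical rule: one more than the sum of all earlier values
    return 1 + sum(f.values())

F = construct(G, 6)
print(F)   # {0: 1, 1: 2, 2: 4, 3: 8, 4: 16, 5: 32}
```

Here F(0) = G(∅) = 1, and each later value depends on all of the preceding values, not only on the immediately preceding one.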


Before proving the transfinite recursion theorem, we want to state it in 
a stronger form. In our informal comments we spoke of a “rule” for 
forming F(t) from the restriction $F \restriction \operatorname{seg} t$. We want to be broad-minded
and to allow the case in which the rule is not given by a function. We
have in mind such rules as


$$F(t) = \{F(x) \mid x < t\} = \operatorname{ran}(F \restriction \operatorname{seg} t).$$


There is no function G such that G(a) = ran a for every set a; such a
function would have to have everything in its domain, but its domain is
merely a set. To be sure, for any fixed set B there is indeed a function G
with G(a) = ran a for every $a \in B$. But our desire is to avoid having to
produce in advance the set B. 

In terms of proper classes, we can formulate the improved version of 
transfinite recursion as follows. Let < be a well ordering on A and let G be 
a “function-class,” i.e., a class of ordered pairs that satisfies the definition 
of a function except for not being a set. Further suppose that the domain 
of G includes ${}^{<A}V$, where ${}^{<A}V$ is the class


$$\{f \mid \text{for some } t \text{ in } A,\ f \text{ is a function with } \operatorname{dom} f = \operatorname{seg} t\}.$$


Then there exists a unique function F with domain A that is “G-constructed”
in the sense that


$$F(t) = G(F \restriction \operatorname{seg} t)$$

for all $t \in A$. (Here F is a set.)

Because we are working in Zermelo-Fraenkel set theory, we must reword
the above formulation of transfinite recursion so as to avoid reference to
the class G. Instead we allow ourselves only a formula $\gamma(x, y)$ that defines
G:


$$G = \{\langle x, y\rangle \mid \gamma(x, y)\}.$$

Applying this rewording (the standard rewording for these predicaments),
we obtain the statement below.


Transfinite Recursion Theorem Schema For any formula $\gamma(x, y)$ the
following is a theorem:


Assume that < is a well ordering on a set A. Assume that for any
f there is a unique y such that $\gamma(f, y)$. Then there exists a unique
function F with domain A such that

$$\gamma(F \restriction \operatorname{seg} t, F(t))$$

for all t in A.

This is a theorem schema; that is, it is an infinite package of theorems,
one for each formula $\gamma(x, y)$. (The formula $\gamma(x, y)$ is allowed to mention
other fixed sets in addition to x and y.) We will say that F is $\gamma$-constructed
if the condition $\gamma(F \restriction \operatorname{seg} t, F(t))$ holds for every t in dom F.


Example We obtain one instance of transfinite recursion by taking
$\gamma(x, y)$ to be: y = ran x. Now it is automatically true that for any f there
is a unique y such that y = ran f. So we are left with the theorem:


Assume that < is a well ordering on A. Then there exists a unique
function F with domain A such that


$$F(t) = \operatorname{ran}(F \restriction \operatorname{seg} t)$$

for all t in A.


This is the instance of transfinite recursion we will need in order to make 
ordinal numbers. 
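
When A is finite, the function in this instance can be computed outright. The sketch below is an illustration under our own assumptions: the well ordering is given as a Python list with the least element first, sets are modeled by frozenset, and the name ran_constructed is ours.

```python
def ran_constructed(ordered):
    """ordered lists the distinct elements of a finite well-ordered set, least first.

    Returns the function F with F(t) = ran(F restricted to seg t).
    """
    F = {}
    for t in ordered:
        # dom F is currently seg t, so the values so far form ran(F | seg t)
        F[t] = frozenset(F.values())
    return F

A = [10, 20, 30, 40]            # usual ordering on four integers
F = ran_constructed(A)
empty = frozenset()

print(F[10] == empty)                                  # True
print(F[20] == frozenset({empty}))                     # True
print(F[30] == frozenset({empty, frozenset({empty})})) # True
print([len(F[t]) for t in A])                          # [0, 1, 2, 3]
```

The values depend only on the position of t in the ordering, not on which objects the members of A happen to be.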


Example The preliminary form of transfinite recursion is obtained by
choosing the formula $\gamma(x, y)$:

$$\gamma(x, y) \iff \text{either (i) } x \in {}^{<A}B \text{ and } y = G(x) \text{ or (ii) } x \notin {}^{<A}B \text{ and } y = \emptyset.$$

Then for any f there is a unique y such that $\gamma(f, y)$. We can conclude
from the transfinite recursion theorem that there is a unique function F
with domain A that is $\gamma$-constructed: For $t \in A$, either (i) $(F \restriction \operatorname{seg} t) \in {}^{<A}B$
and $F(t) = G(F \restriction \operatorname{seg} t)$ or (ii) $(F \restriction \operatorname{seg} t) \notin {}^{<A}B$ and $F(t) = \emptyset$. Since
$G\colon {}^{<A}B \to B$, we can see (by transfinite induction) that we have alternative
(i) for every t in A. Thus F is G-constructed.


One momentary drawback to the above statement of transfinite 
recursion is that we are unable to prove it with the axioms now at our 
disposal. Actually this drawback is shared by a number of other statements 
that are intuitively true under our informal ideas about sets. For example, 
we are unable to prove that 
is a set. In fact we cannot prove the existence of any inductive set other
than $\omega$. These deficiencies will be eliminated in the next section by
the replacement axioms. 


Exercises 


4. Let < be the usual ordering on the set P of positive integers. For 
n in P, let f(n) be the number of distinct prime factors of n. Define the 
binary relation R on P by 

$$mRn \iff \text{either } f(m) < f(n) \text{ or } [f(m) = f(n) \ \&\ m < n].$$

Show that R is a well ordering on P. Does $\langle P, R\rangle$ resemble any of the
pictures in Fig. 45 (p. 185)? 
5. Assume that < is a well ordering on A, and that $f\colon A \to A$ satisfies the
condition

$$x < y \ \Rightarrow\ f(x) < f(y)$$

for all x and y in A. Show that $x \le f(x)$ for all x in A. [Suggestion:
Consider f(f(x)).]
6. Assume that S is a subset of the real numbers that is well ordered 
(under the usual ordering on reals). Show that S is countable. [Suggestion: 
For each x in S, choose a rational number between x and the next member 
of S, if any.] 
7. Let C be some fixed set. Apply transfinite recursion to $\omega$ (with its usual
well ordering), using for $\gamma(x, y)$ the formula


$$y = C \cup \bigcup\bigcup \operatorname{ran} x.$$


Let F be the $\gamma$-constructed function on $\omega$.
(a) Calculate F(0), F(1), and F(2). Make a good guess as to what
F(n) is.
(b) Show that if $a \in F(n)$, then $a \subseteq F(n^+)$.
(c) Let $\bar{C} = \bigcup \operatorname{ran} F$. Show that $\bar{C}$ is a transitive set and that $C \subseteq \bar{C}$.
(The set $\bar{C}$ is called the transitive closure of C, denoted TC C.)


REPLACEMENT AXIOMS


If H is a function and A is a set, then H[A] is a set, simply 
because it is included in ran H. (Lemma 3D guarantees that ran H is a 
set.) But now consider a “function-class” H, i.e., a class of ordered pairs 
that satisfies the definition of being a function, except that it may not 
be a set. Is it still true that H[A] is a set? With the axioms stated 
thus far, we are unable to prove that it is. (The unprovability of this 
can actually be proved; we will return to this point in Chapter 9.) 
But certainly H[A] cannot be too big to be a set, for it is no larger than 
the set A. So if we adopt the principle that what distinguishes the sets 
from the proper classes is the property of being limited in size, then it is 
eminently reasonable to adopt an axiom that, in a way, asserts that 
H[A] is a set. Now the axiom in Zermelo-Fraenkel set theory cannot 
legally refer to a class H; it must instead involve a formula $\varphi$ that
defines H. (You will recall that there was a similar situation in the case 
of the subset axioms.) For each formula that might define a function-class, 
we get an axiom. 


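Replacement Axioms For any formula $\varphi(x, y)$ not containing the
letter B, the following is an axiom:


$$\begin{aligned}
&\forall A\,[(\forall x \in A)\,\forall y_1\,\forall y_2\,(\varphi(x, y_1) \ \&\ \varphi(x, y_2) \Rightarrow y_1 = y_2) \\
&\qquad \Rightarrow\ \exists B\,\forall y\,(y \in B \Leftrightarrow (\exists x \in A)\,\varphi(x, y))].
\end{aligned}$$


Here the formula $\varphi(x, y)$ is permitted to name sets other than x and y;
we could have emphasized this fact by writing $\varphi(x, y, t_1, \ldots, t_k)$ and
inserting the phrase “$\forall t_1 \cdots \forall t_k$” at the beginning of the axiom. But
$\varphi(x, y)$ must not mention the set B whose existence is being asserted by
the axiom.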

To translate the replacement axioms into words, define the class 


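$$H = \{\langle x, y\rangle \mid x \in A \ \&\ \varphi(x, y)\}.$$

Then the hypothesis of the axiom,

$$(\forall x \in A)\,\forall y_1\,\forall y_2\,(\varphi(x, y_1) \ \&\ \varphi(x, y_2) \Rightarrow y_1 = y_2),$$

asserts that H is a function-class. And the second line,

$$\exists B\,\forall y\,(y \in B \Leftrightarrow (\exists x \in A)\,\varphi(x, y)),$$

asserts that if we let

$$B = \{y \mid (\exists x \in A)\,\varphi(x, y)\} = H[A],$$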


then B is a set. (It then follows that H, being included in A x B, is also 
a set.) 

An alternative way of translating replacement into words is as follows.
Read $\varphi(x, y)$ as “x nominates y.” Then the hypothesis of the axiom
says, “Each member of A nominates at most one object.” And the 
conclusion says, “The collection of all nominees is a set.” 

The name “replacement” reflects the idea of replacing each x in the 
set A by its nominee (if any) to obtain the set B. 


Example If A is a set, then $\{\mathcal{P}a \mid a \in A\}$ is also a set, by Exercise 10
of Chapter 2. But we now have an easy proof of this fact. Take
$\varphi(x, y)$ to be $y = \mathcal{P}x$. That is, let each x nominate its own power
set. Then replacement tells us that the collection of all power sets of 
members of A forms a set. 
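
For finite sets the “nomination” reading can be imitated directly, although in the finite case no axiom is needed to justify it. The sketch below is illustrative only: the helper names power_set and collect_nominees are our own, and the convention that a nominee function returns None when x nominates nothing is an assumption made for this example.

```python
from itertools import combinations

def power_set(a):
    """The power set of a finite set a, as a frozenset of frozensets."""
    items = list(a)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def collect_nominees(A, nominee):
    """Replace each x in A by its nominee (if any) and collect the results."""
    return {y for y in (nominee(x) for x in A) if y is not None}

A = {frozenset(), frozenset({0}), frozenset({0, 1})}
B = collect_nominees(A, power_set)
print(sorted(len(y) for y in B))     # [1, 2, 4]

# A partial nomination: only even numbers nominate something (their half).
halves = collect_nominees({1, 2, 3, 4}, lambda x: x // 2 if x % 2 == 0 else None)
print(sorted(halves))                # [1, 2]
```

Each member of A nominates at most one object, which is the hypothesis of the axiom, and B is the collection of nominees.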


Example Let S be any set. Then replacement tells us that 
$$\{\operatorname{card} x \mid x \in S\}$$
is a set. We take the formula $\varphi(x, y)$ to be y = card x.


Proof of the Transfinite Recursion Theorem The proof is similar, in its
general outline, to the proof of the recursion theorem on $\omega$. Again we
construct the desired function F as the union of many approximating
functions. For t in A, say that a function v is $\gamma$-constructed up to t iff
$\operatorname{dom} v = \{x \mid x \le t\}$ and for any x in dom v,


$$\gamma(v \restriction \operatorname{seg} x, v(x)).$$


1. First we claim that if $t_1 \le t_2$, $v_1$ is $\gamma$-constructed up to $t_1$, and $v_2$
is $\gamma$-constructed up to $t_2$, then $v_1(x) = v_2(x)$ for all $x \le t_1$. Should this
fail, there is a least $x \le t_1$ with $v_1(x) \ne v_2(x)$. By the leastness of x, we have
$v_1 \restriction \operatorname{seg} x = v_2 \restriction \operatorname{seg} x$. Also we have


$$\gamma(v_1 \restriction \operatorname{seg} x, v_1(x)) \quad\text{and}\quad \gamma(v_2 \restriction \operatorname{seg} x, v_2(x)),$$


whence by our assumption on $\gamma$ we conclude that $v_1(x) = v_2(x)$ after all.
This establishes the claim.

In particular, by taking $t_1 = t_2$ we see that for any $t \in A$ there is at most
one function v that is $\gamma$-constructed up to t. We next want to form the set
$\mathcal{K}$ of all functions v that are, for some t in A, $\gamma$-constructed up to t:


$$\mathcal{K} = \{v \mid (\exists t \in A)\ v \text{ is a function } \gamma\text{-constructed up to } t\}.$$

$\mathcal{K}$ is provided by a replacement axiom. Take for $\varphi(t, v)$ the formula

v is a function that is $\gamma$-constructed up to t.

We have shown that


$$t \in A \ \&\ \varphi(t, v_1) \ \&\ \varphi(t, v_2) \ \Rightarrow\ v_1 = v_2.$$
Hence by replacement there is a set $\mathcal{K}$ such that for any v,

$$\begin{aligned}
v \in \mathcal{K} &\iff (\exists t \in A)\,\varphi(t, v) \\
&\iff (\exists t \in A)\ v \text{ is } \gamma\text{-constructed up to } t.
\end{aligned}$$

Now let F be $\bigcup \mathcal{K}$, the union of all the v’s. Thus

(☆) $\langle x, y\rangle \in F \iff v(x) = y$ for some v in $\mathcal{K}$.


First observe that F is a function. For suppose that $\langle x, y_1\rangle$ and $\langle x, y_2\rangle$
belong to F. By (☆) there exist $v_1$, $t_1$, $v_2$, and $t_2$ such that $v_i(x) = y_i$ and $v_i$
is $\gamma$-constructed up to $t_i$ for i = 1, 2. Either $x \le t_1 \le t_2$ or $x \le t_2 \le t_1$,
and in either event we have by our earlier claim $y_1 = v_1(x) = v_2(x) = y_2$.


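2. Next we claim that for any $x \in \operatorname{dom} F$, $\gamma(F \restriction \operatorname{seg} x, F(x))$. For if
$x \in \operatorname{dom} F$, there exists v in $\mathcal{K}$ with $x \in \operatorname{dom} v$. Then we have


$\gamma(v \restriction \operatorname{seg} x, v(x))$ since $v \in \mathcal{K}$,
$v \restriction \operatorname{seg} x = F \restriction \operatorname{seg} x$ by (☆) and part 1,
$v(x) = F(x)$ by (☆),

from which we conclude that $\gamma(F \restriction \operatorname{seg} x, F(x))$.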


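3. We now claim that dom F = A. If this fails, then there is a least
$t \in A - \operatorname{dom} F$. Then $\operatorname{seg} t \subseteq \operatorname{dom} F$; in fact seg t = dom F. Take the unique
y such that $\gamma(F, y)$ and let


$$v = F \cup \{\langle t, y\rangle\}.$$


We want to show that v is $\gamma$-constructed up to t. Clearly v is a function
and $\operatorname{dom} v = \{x \mid x \le t\}$. For any x < t we have $v \restriction \operatorname{seg} x = F \restriction \operatorname{seg} x$ and
v(x) = F(x), and so by part 2 we conclude that $\gamma(v \restriction \operatorname{seg} x, v(x))$. For
the case x = t we have $v \restriction \operatorname{seg} t = F$ and v(t) = y, and so by our choice of
y we obtain $\gamma(v \restriction \operatorname{seg} t, v(t))$. Hence v is $\gamma$-constructed up to t. But this
implies that $t \in \operatorname{dom} F$ after all.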


4. Finally we claim that F is unique. For suppose that F₁ and F₂
both satisfy the conclusion of the theorem. We can apply transfinite
induction; let B be the set on which F₁ and F₂ agree:

    B = {t ∈ A | F₁(t) = F₂(t)}.

It suffices to show that for any t ∈ A,

    seg t ⊆ B ⇒ t ∈ B.

But this is easy. If seg t ⊆ B, then we have F₁ ↾ seg t = F₂ ↾ seg t. Also we
have

    γ(F₁ ↾ seg t, F₁(t)) and γ(F₂ ↾ seg t, F₂(t)),

whence by our assumption on γ we conclude that F₁(t) = F₂(t). Hence
t ∈ B and we are done. ⊣
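
For a finite well-ordered set the construction in this proof collapses to a simple loop, and it may help to see it in that form. The sketch below is only an illustration under stated assumptions: the well ordering is given as a Python list in increasing order, and the formula γ is replaced by a hypothetical function `gamma` that takes the restriction of F to seg t and returns the unique y. The names `transfinite_recursion` and `gamma` are ours, not the text's.

```python
def transfinite_recursion(order, gamma):
    """Build F on a finite well-ordered set.

    order: the elements of A listed in increasing order (assumed finite).
    gamma: a function sending the restriction F|seg t (a dict) to F(t);
           it stands in for the formula gamma(x, y) with y unique for each x.
    """
    F = {}
    for t in order:
        # At this point dom F = seg t = {x | x < t}.
        restriction = dict(F)
        F[t] = gamma(restriction)
    return F


# Hypothetical example: F(t) is 1 plus the sum of all earlier values.
F = transfinite_recursion(["r", "s", "t", "u"], lambda v: 1 + sum(v.values()))
```

Each pass through the loop plays the role of part 3 of the proof: a function constructed up to the elements below t is extended by the single pair ⟨t, y⟩.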


Exercises 


8. Show that the subset axioms are provable from the other axioms. 
9. Show that the pairing axiom is provable from the other axioms. 


EPSILON-IMAGES²


Our first application of transfinite recursion will be to the construction
of the ordinal numbers. Assume that < is a well ordering on A and take
for γ(x, y) the formula: y = ran x. The transfinite recursion theorem then
presents us with a unique function E with domain A such that for any
t ∈ A:

    E(t) = ran(E ↾ seg t)
         = E[seg t]
         = {E(x) | x < t}.

Let α = ran E; we will call α the ∈-image of the well-ordered structure
⟨A, <⟩. (Later α will also be called an ordinal number, but we want to
postpone introducing that terminology.)


Example To get some idea of what the ∈-image might look like, take the
three-element set A = {r, s, t}, where r < s < t. Then we calculate

    E(r) = {E(x) | x < r} = ∅,
    E(s) = {E(x) | x < s} = {E(r)} = {∅},
    E(t) = {E(x) | x < t} = {E(r), E(s)} = {∅, {∅}}.

Thus E(r) = 0, E(s) = 1, E(t) = 2, and the ∈-image α of ⟨A, <⟩ is 3. You are
invited to contemplate whether natural numbers will always appear in α like
this.
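
The calculation in this example can be checked mechanically. The following is a minimal sketch, assuming the well ordering is finite and is supplied as a list in increasing order; sets are modelled by Python `frozenset` values, and the function name `epsilon_image` is our own.

```python
def epsilon_image(order):
    """Return (E, alpha) for a finite well ordering listed in increasing order."""
    E = {}
    earlier = []                     # the values E(x) for x < t, in order
    for t in order:
        E[t] = frozenset(earlier)    # E(t) = {E(x) | x < t}
        earlier.append(E[t])
    return E, frozenset(earlier)     # alpha = ran E


# The natural numbers 0, 1, 2, 3 as sets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

E, alpha = epsilon_image(["r", "s", "t"])
```

Here `E["r"]`, `E["s"]`, and `E["t"]` come out as the sets 0, 1, and 2, and `alpha` is the set 3, in agreement with the hand calculation.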


In the above example, α was a set equinumerous with A. Moreover, while
< was a well ordering on A, the membership relation

    ∈_α = {⟨x, y⟩ ∈ α × α | x ∈ y}

provided a well ordering on α. We will show that this always happens,
whence the name “∈-image.”


² The membership symbol (∈) is not typographically the letter epsilon but originally it
was, and the name “epsilon” persists.

Theorem 7D Let < be a well ordering on A and let E and α be as
described above.

    (a) E(t) ∉ E(t) for any t ∈ A.
    (b) E maps A one-to-one onto α.
    (c) For any s and t in A,

        s < t iff E(s) ∈ E(t).

    (d) α is a transitive set.


Proof We prove that E(t) ∉ E(t) by the “least counterexample” method.
That is, let S be the set of counterexamples:

    S = {t ∈ A | E(t) ∈ E(t)}.

We hope that S is empty. But if not, then there is a least t̂ ∈ S. Since
E(t̂) ∈ E(t̂), there is (by the definition of E(t̂)) some s < t̂ with E(t̂) = E(s).
But then E(s) ∈ E(s), contradicting the leastness of t̂. Hence S = ∅, which
proves part (a).

For part (b), it is obvious that E maps A onto α; we must prove that E is
one-to-one. If s and t are distinct members of A, then one is smaller than the
other; assume s < t. Then E(s) ∈ E(t), but, by part (a), E(t) ∉ E(t). Hence
E(s) ≠ E(t).

In part (c) we have

    s < t ⇒ E(s) ∈ E(t)

by definition. Conversely if E(s) ∈ E(t), then (by the definition of E(t)) there
is some x < t with E(s) = E(x). Since E is one-to-one, we must have s = x
and hence s < t.

Finally for part (d) it is easy to see that α is a transitive set. If
u ∈ E(t) ∈ α, then there is some x < t with u = E(x); consequently u ∈ α. ⊣


Define the binary relation on α:

    ∈_α = {⟨x, y⟩ ∈ α × α | x ∈ y}.

Under the assumptions of the preceding theorem, ∈_α is a well ordering on α.
We will postpone verifying this fact until we can do so in a more general
setting (see Corollary 7H). Since α is a transitive set, we can characterize
∈_α by the condition

    x ∈_α y ⇔ x ∈ y ∈ α.

Exercises

10. For any set S, we can define the relation ∈_S by the equation:

    ∈_S = {⟨x, y⟩ ∈ S × S | x ∈ y}.

(a) Show that for any natural number n, the ∈-image of ⟨n, ∈_n⟩ is n.

(b) Find the ∈-image of ⟨ω, ∈_ω⟩.

11. (a) Although the set Z of integers is not well ordered by its normal
ordering, show that the ordering

    0, 1, 2, ..., −1, −2, −3, ...

is a well ordering on Z.
(b) Suppose that we define the usual function E on Z, using the well
ordering of part (a). Calculate E(3), E(−1), and E(−2). Describe ran E.


ISOMORPHISMS 


Theorem 7D told us that a well-ordered structure ⟨A, <⟩ looks a great
deal like its ∈-image ⟨α, ∈_α⟩. We had a one-to-one correspondence E between
A and α with the order-preserving property:

    s < t iff E(s) ∈ E(t)

for s and t in A. In formalizing this concept of “looking alike,” we need not
restrict attention to well orderings. Instead we might as well formulate our
definition in broader terms.


Definition Consider structures ⟨A, R⟩ and ⟨B, S⟩. An isomorphism
from ⟨A, R⟩ onto ⟨B, S⟩ is a one-to-one function f from A onto B such
that

    xRy iff f(x)Sf(y)

for x and y in A. If such an isomorphism exists, then we say that ⟨A, R⟩ is
isomorphic to ⟨B, S⟩, written: ⟨A, R⟩ ≅ ⟨B, S⟩.

Example Let < be a well ordering on A and let α be the ∈-image of
⟨A, <⟩. Then Theorem 7D states that ⟨A, <⟩ is isomorphic to ⟨α, ∈_α⟩, and
that E is an isomorphism.
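
For finite structures the definition of isomorphism can be tested directly. The sketch below is one way to do so, assuming A and B are finite Python sets, the relations R and S are sets of ordered pairs, and the candidate map f is a dict; the name `is_isomorphism` is ours.

```python
def is_isomorphism(f, A, R, B, S):
    """Check that f is a one-to-one map from A onto B with xRy iff f(x)Sf(y)."""
    if set(f) != set(A):
        return False                      # f must have domain exactly A
    images = [f[x] for x in A]
    one_to_one = len(set(images)) == len(images)
    onto = set(images) == set(B)
    preserves = all(((x, y) in R) == ((f[x], f[y]) in S)
                    for x in A for y in A)
    return one_to_one and onto and preserves


A = {"r", "s", "t"}
R = {("r", "s"), ("r", "t"), ("s", "t")}          # r < s < t
B = {0, 1, 2}
S = {(m, n) for m in B for n in B if m < n}       # usual order on 3

good = {"r": 0, "s": 1, "t": 2}
bad = {"r": 1, "s": 0, "t": 2}
```

The map `good` passes the test, while `bad` is one-to-one and onto but fails to preserve the relation.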


Example Isomorphic structures really do look alike. We can draw pictures
(at least in the simple cases) as in Figs. 43 and 44. Figure 45 consists of
pictures of four well-ordered structures. Figure 45a shows a well ordering
on {r, s, t}, and Fig. 45b shows a well ordering on {x, y, z}. The two
structures are isomorphic, and the two pictures are obviously very much
alike. Figures 45c and 45d provide pictures of infinite structures. Figure 45c
shows the usual ordering on ω. Figure 45d is a picture of ω × ω with the
“lexicographic” (dictionary) ordering

    ⟨m, n⟩ <_L ⟨p, q⟩ iff m ∈ p or (m = p & n ∈ q).

Although ω is equinumerous to ω × ω, the pictures do not look alike. For
example, in Fig. 45c the set seg n is always finite, whereas in Fig. 45d the set
seg⟨m, n⟩ is infinite for m ≠ 0. And the two structures are not isomorphic.

[Fig. 45. Four well orderings, of which two are isomorphic.]


Actually we have used the word “isomorphic” before, e.g., in Theorem
4H. Both there and here we are using special cases of the general concept of
isomorphism of structures. In Theorem 4H we had “structures” ⟨N, S, e⟩
consisting of a set N together with a function S: N → N and an element
e ∈ N. Now we have structures ⟨A, R⟩, where R is a binary relation on A.
We will have no need here for greater generality, but you are nonetheless
invited to devise general concepts of structure and isomorphism.

We have the easy theorem stating that isomorphism is an “equivalence 
concept.” That is, it obeys laws of reflexivity, symmetry, and transitivity. 


Theorem 7E Let ⟨A, R⟩, ⟨B, S⟩, and ⟨C, T⟩ be any structures. Then

    ⟨A, R⟩ ≅ ⟨A, R⟩,
    ⟨A, R⟩ ≅ ⟨B, S⟩ ⇒ ⟨B, S⟩ ≅ ⟨A, R⟩,
    ⟨A, R⟩ ≅ ⟨B, S⟩ ≅ ⟨C, T⟩ ⇒ ⟨A, R⟩ ≅ ⟨C, T⟩.

Proof For the first of these we can use the identity function on A; it is
an isomorphism of ⟨A, R⟩ onto itself. To prove the second we take the
inverse of the given isomorphism; to prove the third we take the composition
of the two given isomorphisms. ⊣


If two structures are isomorphic, then for any statement that is true of 
one, there is a corresponding statement that is true of the other. This is 
because the structures look exactly alike. But to nail down this general 
principle in precise terms, we would have to say formally what we meant by 
“true statement” and “corresponding statement.” This would lead us away 
from set theory into logic. Instead we will just list (in Theorem 7G) some 
particular instances of the general principle. 


Lemma 7F Assume that f is a one-to-one function from A into B,
and that <_B is a partial ordering on B. Define the binary relation <_A on A
by

    x <_A y iff f(x) <_B f(y)

for x and y in A.

    (a) The relation <_A is a partial ordering on A.
    (b) If <_B is a linear ordering on B, then <_A is a linear ordering on A.
    (c) If <_B is a well ordering on B, then <_A is a well ordering on A.


Proof The proof is straightforward and is left, for the most part, to the
exercises. We will give details here for only the following part: If <_B is a
well ordering on B, then any nonempty subset S of A has a least (with
respect to <_A) element. First note that f[S] is a nonempty subset of B.
Hence it has a least element, which must be f(m) for some unique m ∈ S. We
claim that m is least in S. Let x be any other member of S. Then
f(x) ∈ f[S], and by the leastness of f(m) we have f(m) ≤_B f(x). In fact,
f(m) <_B f(x) since f is one-to-one. Hence m <_A x, which shows that m is
indeed least in S. ⊣
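
The lemma can be tried out on a small case. The sketch below assumes a finite set A, a one-to-one map f given as a dict, and an ordering on B given as a two-argument predicate; the names `pull_back` and `least_element` are our own and serve only as an illustration.

```python
def pull_back(f, A, less_B):
    """Return <_A as a set of pairs: x <_A y iff f(x) <_B f(y)."""
    return {(x, y) for x in A for y in A if less_B(f[x], f[y])}


def least_element(S, less_A):
    """Return the member m of S with m <_A x for every other x in S, or None."""
    for m in S:
        if all(x == m or (m, x) in less_A for x in S):
            return m
    return None


# f maps {a, b, c} one-to-one into the integers with their usual ordering.
f = {"a": 5, "b": 2, "c": 9}
less_A = pull_back(f, f.keys(), lambda u, v: u < v)
```

With this choice of f the induced ordering is b <_A a <_A c, and the least element of any nonempty subset S is the member whose image is least in f[S], just as in the proof.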


Theorem 7G Assume that structures ⟨A, <_A⟩ and ⟨B, <_B⟩ are
isomorphic. If one is a partially (or linearly or well) ordered structure, so
also is the other.

Proof Suppose that one is a partially ordered structure; by symmetry
we may suppose that it is ⟨B, <_B⟩. We are given an isomorphism f of
⟨A, <_A⟩ onto ⟨B, <_B⟩. Thus

    x <_A y iff f(x) <_B f(y)

for x and y in A. Hence we can apply the above lemma. ⊣

Theorem 7D told us that well orderings were isomorphic to their
∈-images. In light of the above theorem, we can now conclude that the
∈-images are well ordered by epsilon.

Corollary 7H Assume that < is a well ordering on A and let α be the
∈-image of ⟨A, <⟩. Then α is a transitive set and ∈_α is a well ordering on α.


Exercises 


12. Complete the proof of Lemma 7F. 


13. Assume that two well-ordered structures are isomorphic. Show that 
there can be only one isomorphism from the first onto the second. 


14. Assume that ⟨A, <⟩ is a partially ordered structure. Define the function
F on A by the equation

    F(a) = {x ∈ A | x ≤ a},

and let S = ran F. Show that F is an isomorphism from ⟨A, ≤⟩ onto
⟨S, ⊆_S⟩.


ORDINAL NUMBERS 


It will be extremely useful in our discussion of well orderings if we can 
assign a “number” to each well-ordered structure that measures its “length.” 
Two well-ordered structures should receive the same number iff they look 
alike, i.e., iff they are isomorphic. The situation is much like that in Chapter 6, 
where we assigned cardinal numbers to sets so as to measure their size. Two 
sets received the same cardinal number iff they were equinumerous. The key 
to the present situation lies in the following theorem.

Theorem 7I Two well-ordered structures are isomorphic iff they have
the same ∈-image. That is, if <₁ and <₂ are well orderings on A₁ and
A₂, respectively, then ⟨A₁, <₁⟩ ≅ ⟨A₂, <₂⟩ iff the ∈-image of ⟨A₁, <₁⟩
is the same as the ∈-image of ⟨A₂, <₂⟩.


Proof In one direction, this follows at once from the fact that well-ordered
structures are isomorphic to their ∈-images. That is, if ⟨A₁, <₁⟩ and
⟨A₂, <₂⟩ have the same ∈-image α, then

    ⟨A₁, <₁⟩ ≅ ⟨α, ∈_α⟩ ≅ ⟨A₂, <₂⟩.

For the other direction, assume that f is an isomorphism from ⟨A₁, <₁⟩
onto ⟨A₂, <₂⟩. Let E₁: A₁ → α₁ and E₂: A₂ → α₂ be the usual isomorphisms
of the well orderings onto their ∈-images:

    E₁(s) = {E₁(x) | x <₁ s} and E₂(t) = {E₂(y) | y <₂ t}.


Then we can use transfinite induction to show that E₁(s) = E₂(f(s)) for all
s ∈ A₁. For let B be the set on which this equation holds:

    B = {s ∈ A₁ | E₁(s) = E₂(f(s))}.

Suppose that seg s ⊆ B. Then we calculate

    E₁(s) = {E₁(x) | x <₁ s}         by definition
          = {E₂(f(x)) | x <₁ s}      since seg s ⊆ B
          = {E₂(y) | y <₂ f(s)}      (see below)
          = E₂(f(s)).

The third line of this calculation requires some thought. Clearly

    {E₂(f(x)) | x <₁ s} ⊆ {E₂(y) | y <₂ f(s)}

because we can take y = f(x). Conversely, the other inclusion holds because
f maps onto A₂ and therefore any y less than f(s) must equal f(x) for some
(unique) x less than s.

Thus if seg s ⊆ B, then E₁(s) = E₂(f(s)) and so s ∈ B. By the transfinite
induction principle, B = A₁, i.e.,

    E₁(s) = E₂(f(s))

for all s ∈ A₁. Consequently,

    α₁ = {E₁(s) | s ∈ A₁}
       = {E₂(f(s)) | s ∈ A₁}
       = {E₂(t) | t ∈ A₂}          since f maps onto A₂
       = α₂.

Thus the ∈-images coincide. ⊣

The solution to our problem of how to assign “numbers” to well
orderings is provided by the foregoing theorem. We just use ∈-images.

Definition Let < be a well ordering on A. The ordinal number of
⟨A, <⟩ is its ∈-image. An ordinal number is a set that is the ordinal number
of some well-ordered structure.


(More generally, we could seek to define an “order type” ot⟨A, R⟩
whenever R was a partial ordering on A. The order type should measure
the “shape” of R, in the sense that

    ot⟨A, R⟩ = ot⟨B, S⟩ iff ⟨A, R⟩ ≅ ⟨B, S⟩.

This can be done; see Exercise 32. But the methods that are required are
quite unlike the methods used for ordinal numbers.)

We can generate examples of ordinal numbers by calculating ∈-images.
The example preceding Theorem 7D shows that 3 is an ordinal number.
And study of Fig. 45c will show that ω is an ordinal number. More examples
will appear presently. First we need some additional information about well
orderings.

If < is any ordering on A and C is a subset of A, then the elements of
C are, of course, still ordered by <. Now it is not strictly true that < is
an ordering on C, because (if C ≠ A) < is not a binary relation on C.
But < ∩ (C × C) is an ordering on C. This notation is excessively cumbersome
for the simplicity of the ideas involved. And so we define

    ⟨C, <°⟩ = ⟨C, < ∩ (C × C)⟩.

The symbol ° reminds us that the relation < can be cut down to fit the
set C. In particular, if C = seg c, then we have the ordered structure

    ⟨seg c, <°⟩.


Theorem 7J Assume that < is a partial ordering on A, and that C ⊆ A.
Then <° is a partial ordering on C. This continues to hold with “partial”
replaced by “linear” or “well.”

Proof This is an immediate consequence of Lemma 7F, wherein we
take f to be the identity function on C. ⊣


The following theorem is a sort of trichotomy law for well-ordered 
structures. 


Theorem 7K For any two well-ordered structures, either they are
isomorphic or one is isomorphic to a segment of the other. More precisely,
let <_A and <_B be well orderings on A and B. Then one of the following
alternatives holds:

    ⟨A, <_A⟩ ≅ ⟨B, <_B⟩,
    ⟨A, <_A⟩ ≅ ⟨seg b, <_B°⟩ for some b ∈ B,
    ⟨seg a, <_A°⟩ ≅ ⟨B, <_B⟩ for some a ∈ A.


Proof The idea is to start pairing elements of A with elements of B in
the natural way: We pair the least elements of the two sets, then the next
smallest, and so forth. Eventually we run out of elements on one side or the
other. If the two sets run out simultaneously, then ⟨A, <_A⟩ ≅ ⟨B, <_B⟩.
Otherwise one set runs out first; in this case it is isomorphic to a segment
of the other.

The pairing is obtained by transfinite recursion. Let e be some extraneous
object belonging neither to A nor to B. By transfinite recursion there is a
unique function F: A → B ∪ {e} such that for each t ∈ A,

    F(t) = the least element of B − F[seg t]   if any,
         = e                                   if B − F[seg t] is empty.

Case I e ∈ ran F. Let a be the least element of A for which F(a) = e.
We claim that F ↾ seg a is an isomorphism from ⟨seg a, <_A°⟩ onto
⟨B, <_B⟩. Let F° = F ↾ seg a.
Clearly F°: seg a → B, and its range is all of B since B − F[seg a] = ∅.
We have

    x ≤_A y <_A a ⇒ F[seg x] ⊆ F[seg y]
                  ⇒ B − F[seg y] ⊆ B − F[seg x]
                  ⇒ F(x) ≤_B F(y).

Furthermore if x <_A y <_A a then F(x) ≠ F(y), because F(x) ∈ F[seg y] but
by construction F(y) ∉ F[seg y]. Therefore F° is one-to-one, and in fact

    x <_A y <_A a ⇒ F(x) <_B F(y).

Consequently, F° preserves order, since

    F(x) <_B F(y) ⇒ F(y) ≰_B F(x)
                  ⇒ y ≰_A x
                  ⇒ x <_A y.

Thus F° is an isomorphism as claimed.
Case II ran F = B. Then F: A → B, and just as in Case I (but without
the a) F is one-to-one and preserves order. Thus F is an isomorphism from
⟨A, <_A⟩ onto ⟨B, <_B⟩.

Case III Otherwise ran F is a proper subset of B. Let b be the least
element of B − ran F. We claim that ran F = seg b. By the leastness of b, we
have seg b ⊆ ran F. Conversely, we cannot have b <_B F(x) for any x, because
F(x) is least in B − F[seg x], a set to which b belongs. Hence ran F = seg b.

As in Cases I and II, F is one-to-one and preserves order. Hence F is
an isomorphism from ⟨A, <_A⟩ onto ⟨seg b, <_B°⟩. ⊣
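
The pairing function F of this proof is easy to simulate when both well orderings are finite. The sketch below assumes A and B are given as lists in increasing order with hashable elements; the function name `compare_well_orderings` and the three result labels are our own inventions for the illustration.

```python
def compare_well_orderings(A, B):
    """Decide which alternative of the trichotomy holds for finite lists A, B.

    Returns ("isomorphic", None), ("A iso seg b", b), or ("seg a iso B", a).
    """
    e = object()                 # extraneous object, in neither A nor B
    F = {}
    used = set()                 # F[seg t] for the current t
    for t in A:
        remaining = [b for b in B if b not in used]
        F[t] = remaining[0] if remaining else e   # least of B - F[seg t], else e
        used.add(F[t])

    for a in A:                  # Case I: e is in ran F
        if F[a] is e:
            return ("seg a iso B", a)
    missing = [b for b in B if b not in used]
    if not missing:              # Case II: ran F = B
        return ("isomorphic", None)
    return ("A iso seg b", missing[0])            # Case III: ran F = seg b
```

For instance, comparing a four-element ordering with a two-element one lands in Case I, with a the third element of the longer ordering.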


We now return to the study of ordinal numbers. The next theorem will
show that any ordinal number is its own ∈-image. To extract the maximum
of information from the proof, the following definition will help.

Definition A set A is well ordered by epsilon iff the relation

    ∈_A = {⟨x, y⟩ ∈ A × A | x ∈ y}

is a well ordering on A.

We have shown that any ordinal number is a transitive set that is well
ordered by epsilon (Corollary 7H). The converse is also true; the ordinal
numbers are exactly the transitive sets that are well ordered by epsilon:

Theorem 7L Let α be any transitive set that is well ordered by epsilon.
Then α is an ordinal number; in fact α is the ∈-image of ⟨α, ∈_α⟩.


Proof Let E be the usual function from α onto its ∈-image. We can
use transfinite induction to show that E is just the identity function on α.
Note that for t ∈ α,

    x ∈ t ⇔ x ∈_α t

because α is a transitive set. As a consequence we have seg t = t.
If the equation E(x) = x holds for all x in seg t, then

    E(t) = {E(x) | x ∈_α t}
         = {x | x ∈_α t}
         = seg t
         = t.

Hence by transfinite induction, E is the identity function on α, so the ∈-image
of ⟨α, ∈_α⟩ is just α itself. ⊣


The class of all ordinal numbers is (as we will prove shortly) too 
large to be a set. But except for not being a set, it has many of the 
properties of the ordinal numbers themselves. The next theorem shows that 
the class of all ordinal numbers is a transitive class well ordered by epsilon.

Theorem 7M The following are valid for any ordinal numbers α, β,
and γ.

    (a) (transitive class) Any member of α is itself an ordinal number.
    (b) (transitivity) α ∈ β ∈ γ ⇒ α ∈ γ.
    (c) (irreflexivity) α ∉ α.
    (d) (trichotomy) Exactly one of the alternatives,

        α ∈ β,  α = β,  β ∈ α,

    holds.
    (e) Any nonempty set S of ordinal numbers has a least element μ
    (i.e., μ ∈̲ α for every α ∈ S).


Proof For part (a), consider any x in α. Now α is the ∈-image of some
well-ordered structure ⟨A, <⟩. So x = E(t) where E is the usual isomorphism
and t is some member of A. We will prove that x is an ordinal by
showing that it is the ∈-image of ⟨seg t, <°⟩. By Theorem 7J the
restriction <° is a well ordering on seg t. The ∈-image of ⟨seg t, <°⟩ is
easily seen to be E[seg t], but this is just E(t), which is x. Hence x is an
ordinal.

Part (b) is true because the ordinal γ is a transitive set.

Part (c) follows from Theorem 7D(a). If α ∈ α, then α = E(t) for some t
(in the notation used above in proving part (a)). But then E(t) ∈ E(t), which
contradicts Theorem 7D(a).

In trichotomy, the “at most one” part follows from transitivity and
irreflexivity (as in Theorem 7A). The “at least one” half is a consequence of
Theorem 7K, which tells us that either ⟨α, ∈_α⟩ and ⟨β, ∈_β⟩ are isomorphic,
or else one is isomorphic to a segment of the other.


Case I ⟨α, ∈_α⟩ ≅ ⟨β, ∈_β⟩. Then both have the same ∈-image, by
Theorem 7I. But ordinals are their own ∈-images by Theorem 7L, so α = β.

Case II By symmetry we may suppose that ⟨α, ∈_α⟩ is isomorphic to
⟨seg δ, ∈_β°⟩, where δ ∈ β. But then δ is an ordinal, seg δ = δ, and
∈_β° = ∈_δ. Thus we have

    ⟨α, ∈_α⟩ ≅ ⟨δ, ∈_δ⟩.

Now by Case I, α = δ. Hence α ∈ β.
For part (e), first consider some arbitrary β ∈ S. If β ∩ S = ∅, then we
claim that β is least in S. This is because

    α ∈ S ⇒ α ∉ β ⇒ β ∈̲ α

by trichotomy. There remains the possibility that β ∩ S ≠ ∅. Then we have
here a nonempty subset of β. Hence β ∩ S has a least (with respect to ∈_β)
member μ. We claim that μ is least in S. To verify this, consider any
α ∈ S:

    α ∉ β ⇒ β ∈̲ α ⇒ μ ∈ α,
    α ∈ β ⇒ α ∈ β ∩ S ⇒ μ ∈̲ α.

Either way, μ ∈̲ α. ⊣


Corollary 7N (a) Any transitive set of ordinal numbers is itself an
ordinal number.

(b) 0 is an ordinal number.

(c) If α is any ordinal number, then α⁺ is also an ordinal number.

(d) If A is any set of ordinal numbers, then ⋃A is also an ordinal
number.


Proof (a) We can conclude from the preceding theorem that any set of
ordinals is well ordered by epsilon. If, in addition, the set is a transitive
set, then it is an ordinal number by Theorem 7L.

(b) ∅ is a transitive set of ordinals. (It is the ∈-image of the well-ordered
structure ⟨∅, ∅⟩.)

(c) Recall that α⁺ = α ∪ {α}. The members of α⁺ are all ordinals (by
Theorem 7M(a)). And α⁺ is a transitive set by Theorem 4E (compare
Exercise 2 of Chapter 4).

(d) We can also apply part (a) to prove part (d). Any member of ⋃A
is a member of some ordinal, and so is itself an ordinal. To show that
⋃A is a transitive set, we calculate

    δ ∈ ⋃A ⇒ δ ∈ α ∈ A   for some α
           ⇒ δ ⊆ α ∈ A   since α is transitive
           ⇒ δ ⊆ ⋃A.

Thus by part (a), ⋃A is an ordinal. ⊣


In fact more is true here. If A is any set of ordinals, then ⋃A is an
ordinal that is the least upper bound of A. To verify this, first note that
the epsilon ordering on the class of all ordinals is also defined by ⊂. That
is, for ordinals α and β we have:

    α ∈ β ⇒ α ⊂ β     since β is a transitive set and α ≠ β
          ⇒ β ∉ α     lest β ∈ β
          ⇒ α ∈ β     by trichotomy.

Thus α ∈̲ β iff α ⊆ β. Since ⋃A is the least upper bound for A in the inclusion
ordering, it is the least upper bound for the epsilon ordering.

In the same vein, we can assert that α⁺ is the least ordinal larger than α.
Clearly α ∈ α⁺, so α⁺ is larger than α. Suppose α ∈ β for some ordinal β.
Then α ⊆ β as well, and hence α⁺ ⊆ β. As noted above, this yields the fact
that α⁺ ∈̲ β. This shows that α⁺ is the least ordinal larger than α.

Observe also that any ordinal is just the set of all smaller ordinals. That is:

    α = {x | x ∈ α}
      = {x | x is an ordinal & x ∈ α}.

Since our ordering relation is ∈, this can be read as, “α equals the set of
ordinals less than α.”
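
The finite ordinals are small enough to experiment with directly. The sketch below models hereditarily finite sets by Python `frozenset` values and tests the characterization "transitive set well ordered by epsilon" (for a finite set, a linear ordering is automatically a well ordering). The names `successor`, `big_union`, `is_transitive_set`, and `is_ordinal` are ours.

```python
def successor(a):
    """a+ = a ∪ {a}."""
    return a | frozenset({a})


def big_union(A):
    """⋃A for a collection A of frozensets."""
    return frozenset(x for a in A for x in a)


def is_transitive_set(a):
    """Every member of a is also a subset of a."""
    return all(x <= a for x in a)


def is_ordinal(a):
    """Finite check: a is a transitive set linearly ordered by membership."""
    trichotomy = all((x in y) + (x == y) + (y in x) == 1
                     for x in a for y in a)
    transitive = all(x in z for z in a for y in z for x in y)
    return is_transitive_set(a) and trichotomy and transitive


zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
```

On these examples the successor and union of ordinals are again ordinals, every member of 3 is an ordinal, and a set such as {1} fails the test because it is not transitive.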

At long last we can get a better picture of what the ordinals are like. The
least ordinal is 0. Next come 0⁺, 0⁺⁺, ..., i.e., the natural numbers. The
least ordinal greater than these is the least upper bound of the set of
natural numbers. This is just ⋃ω, which equals ω. (Or more directly: ω is
an ordinal and the ordinals less than ω are exactly its members.) Then come
ω⁺, ω⁺⁺, .... All of these are countable ordinals (i.e., ordinals that are
countable sets). The least uncountable ordinal Ω is the set of all smaller
ordinals, i.e., is the set of all countable ordinals. But here we are on thin
ice, because we have not shown that there are any uncountable ordinals.
That defect will be corrected in the next section.


Burali-Forti Theorem There is no set to which every ordinal number 
belongs. 


Proof Theorem 7M told us that the class of all ordinals was a transitive
class well ordered by epsilon. If it were a set, then by Theorem 7L it
would be an ordinal itself. But then it would be a member of itself, and no
ordinal has that property. ⊣


This theorem is also called the “Burali-Forti paradox.” Like the later 
paradox of Russell, it showed the inconsistency of Cantor’s casual use of 
the abstraction principle, which permitted speaking of “the set of all 
ordinals.” The theorem was published in 1897 by Burali-Forti, and was the 
first of the set-theoretic paradoxes to be published. 


Exercises 


15. (a) Assume that < is a well ordering on A and that t ∈ A. Show that
⟨A, <⟩ is never isomorphic to ⟨seg t, <°⟩.
(b) Show that in Theorem 7K, at most one of the three alternatives
holds.

16. Assume that α and β are ordinal numbers with α ∈ β. Show that
α⁺ ∈ β⁺. Conclude that whenever α ≠ β, then α⁺ ≠ β⁺.

17. Assume that ⟨A, <⟩ is a well-ordered structure with ordinal number
α and that B ⊆ A. Let β be the ordinal number of the well-ordered
structure ⟨B, <°⟩ and show that β ∈̲ α.

18. Assume that S is a set of ordinal numbers. Show that either (i) S has
no greatest element, ⋃S ∉ S, and ⋃S is not the successor of another
ordinal; or (ii) ⋃S ∈ S and ⋃S is the greatest element of S.

19. Assume that A is a finite set and that < and ≺ are linear orderings
on A. Show that ⟨A, <⟩ and ⟨A, ≺⟩ are isomorphic.

20. Show that if R and R⁻¹ are both well orderings on the same set S,
then S is finite.

21. Prove the following version of Zorn’s lemma. Assume that < is a
partial ordering on A. Assume that whenever C is a subset of A for which
<° is a linear ordering on C, then C has an upper bound in A. Then there
exists a maximal element of A.


DEBTS PAID 


In this section we will pay off the debts incurred in Chapter 6. We will 
define cardinal numbers as promised, and we will complete the proof of 
Theorem 6M. 

The first theorem assures an adequate supply of ordinal numbers. Recall
that B is dominated by A (B ≼ A) iff B is equinumerous to a subset of A.


Hartogs’ Theorem For any set A, there is an ordinal not dominated by A.

Proof There is a systematic way to construct the least such ordinal α.
Any ordinal that is dominated by A is smaller than (i.e., is a member of) α.
And conversely, any ordinal smaller than (i.e., a member of) α should be
dominated by A, if α is to be least. Hence we decide to try defining

    α = {β | β is an ordinal & β ≼ A}.

The essential part of the proof is showing that α is a set. After all, the
theorem is equivalent to the assertion that α is not the class of all ordinals.
We begin by defining

    W = {⟨B, <⟩ | B ⊆ A & < is a well ordering on B}.


We claim that any member of α is obtainable as the ∈-image of something
in W. (Thus α is, in a sense, no larger than W.) Consider any ordinal β
in α. Then some function f maps β one-to-one onto a subset B of A. There
is a well ordering < on B such that f is an isomorphism from ⟨β, ∈_β⟩ onto
⟨B, <⟩, namely

    < = {⟨f(x), f(y)⟩ | x ∈ y ∈ β}.

Thus ⟨B, <⟩ ∈ W and the ∈-image of ⟨B, <⟩ is the same as the ∈-image
of ⟨β, ∈_β⟩, which is just β. Thus β is indeed the ∈-image of a member of W.

W is a set, because if ⟨B, <⟩ ∈ W, then

    ⟨B, <⟩ ∈ 𝒫A × 𝒫(A × A).

We can then use a replacement axiom to construct the set ℰ of ∈-images
of members of W. The preceding paragraph shows that α ⊆ ℰ. (In fact it
is not hard to see that α = ℰ.)


Since α is a set, we can conclude from the Burali-Forti theorem that some
ordinals do not belong to α. This proves the theorem, but we will continue
and show that α itself is the least such ordinal. Clearly α is a set of ordinals,
and since

    γ ∈ β ∈ α ⇒ γ ⊆ β ≼ A ⇒ γ ∈ α,

α is transitive. Hence α is an ordinal. We could not have α ≼ A, lest α
belong to itself. And α is the least such ordinal, since β ∈ α ⇒ β ≼ A by
construction.


Hartogs’ theorem does not require the axiom of choice. But the following
theorem does.

Well-Ordering Theorem For any set A, there is a well ordering on A.

This theorem is often stated more informally: Any set can be well
ordered. As we will observe presently, the well-ordering theorem is equivalent
to the axiom of choice. We can prove the well-ordering theorem from
Zorn’s lemma (Exercise 22). But in order to complete the proof of
Theorem 6M, we will instead use the axiom of choice, III.


Proof Let G be a choice function for A, and let α be an ordinal not
dominated by A (furnished by Hartogs’ theorem). Our strategy is to order
A by first choosing a least element, then a next-to-least, and so forth. For
the “and so forth” part, we use α as a base; α is large enough that we
will exhaust A before coming to the end of α.

Let e be an extraneous object not belonging to A. We can use transfinite
recursion to obtain a function F: α → A ∪ {e} such that for any γ ∈ α,

    F(γ) = G(A − F[γ])   if A − F[γ] ≠ ∅,
         = e             if A − F[γ] = ∅.

In other words, F(γ) is the chosen member of A − F[γ], until A is exhausted.
Suppose γ ∈ β ∈ α. If neither F(γ) nor F(β) equals e, then F(γ) ≠ F(β),
since F(γ) ∈ F[β] but F(β) ∉ F[β]. That is, F is one-to-one until e appears.
It follows that e ∈ ran F, lest F be a one-to-one function from α into A,
contradicting the fact that A does not dominate α.

Let δ be the least element of α for which F(δ) = e. Then F[δ] = A. And
by the preceding paragraph, F ↾ δ is a one-to-one map from δ onto A.
This produces a well ordering on A, namely

    < = {⟨F(β), F(γ)⟩ | β ∈ γ ∈ δ}.

Then F ↾ δ is an isomorphism from ⟨δ, ∈_δ⟩ onto ⟨A, <⟩. ⊣
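
For a finite set the recursion in this proof needs only finitely many steps, so it can be run as an ordinary loop. The sketch below assumes A is a finite Python set and that the choice function G is supplied as a function from nonempty frozensets to one of their members; the list index of an element plays the role of the ordinal γ with F(γ) equal to that element. The name `well_order` is ours.

```python
def well_order(A, G):
    """List the members of a finite set A in the order chosen by G.

    G: a choice function, mapping each nonempty subset of A (a frozenset)
       to one of its members.
    """
    chosen = []                     # chosen[k] plays the role of F(k)
    remaining = set(A)              # A - F[gamma] at the current stage
    while remaining:                # stop once A is exhausted, i.e. F(gamma) = e
        x = G(frozenset(remaining))
        chosen.append(x)
        remaining.remove(x)
    return chosen


# Hypothetical choice function: always pick the largest remaining number.
ordering = well_order({3, 1, 2}, max)
```

With `max` as the choice function the resulting well ordering lists 3 first, then 2, then 1, which shows that the ordering produced depends entirely on G.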


Example The well-ordering theorem claims that there exists a well 
ordering W on the set R of real numbers. Now the usual ordering on R is 
definitely not a well ordering, so W is some unusual ordering. What is it? 
We have a theorem asserting the existence of W, but its proof (which used 
choice) gives no clues as to just what W is like. It is entirely possible that 
there is no formula of the language of set theory that will explicitly define 
a well ordering on R. 


The following variant of the well-ordering theorem follows easily from it. 


Numeration Theorem Any set is equinumerous to some ordinal 
number. 


Proof Consider any set A; by the preceding theorem there is a well
ordering < on A. Then A is equinumerous to the ∈-image α of ⟨A, <⟩. ⊣

The name “numeration theorem” reflects the possibility of “counting”
the members of A by using the ordinal numbers in α, where a “counting” is
a one-to-one correspondence.

The numeration theorem produces a satisfactory definition of cardinal 
number. 


Definition For any set A, define the cardinal number of A (card A) to 
be the least ordinal equinumerous to A. 


The following theorem makes good our promise of Chapter 6.

Theorem 7P (a) For any sets A and B,

    card A = card B iff A ≈ B.

(b) For a finite set A, card A is the unique natural number
equinumerous to A.


Proof First, note that by our definition, A ≈ card A for any set A.
Hence if card A = card B, then we have

    A ≈ card A = card B ≈ B.

Conversely if A ≈ B, then A and B are equinumerous to exactly the same
ordinals. Hence the least such ordinal is the same in both cases.

Part (b) follows from the fact that the natural numbers are also ordinal
numbers. If A is finite, then (by definition) A ≈ n for a unique natural
number n. A is not equinumerous to any smaller ordinal, since the smaller
ordinals are just the natural numbers in n. Hence card A = n. ⊣


Proof of Theorem 6M, concluded Figure 46 shows how Fig. 40 is to be
extended. Here “WO” is the well-ordering theorem, which we proved from
(3). It remains to establish (5) ⇒ WO and WO ⇒ (6).

[Fig. 46. The proof of Theorem 6M.]

Proof of (5) ⇒ WO Given any set A, we obtain from Hartogs’ theorem
an ordinal α not dominated by A. By (5), we have A ≼ α; i.e., there is a
one-to-one function f from A into α. This induces a well ordering of A;
define

    s < t ⇔ f(s) ∈ f(t)

for s and t in A. Then < is a well ordering of A (by Lemma 7F).


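Proof of WO $\Rightarrow$ (6) Consider any $\mathcal{A}$ that is closed under unions of
chains. By WO there exists a well ordering $<$ of $\mathcal{A}$. We want to make a
very large chain $\mathcal{C} \subseteq \mathcal{A}$ such that $\bigcup\mathcal{C}$ is a maximal element. The idea is to
go through the members of $\mathcal{A}$ in order, adding to $\mathcal{C}$ any member of $\mathcal{A}$
that includes each set previously placed in $\mathcal{C}$. This construction uses
transfinite recursion. That is, the transfinite recursion theorem gives us a function
$F\colon \mathcal{A} \to 2$ such that for any $A \in \mathcal{A}$,

$$F(A) = \begin{cases} 1 & \text{if } A \text{ includes every set } B < A \text{ for which } F(B) = 1, \\ 0 & \text{otherwise.} \end{cases}$$

Then let $\mathcal{C} = \{A \in \mathcal{A} \mid F(A) = 1\}$. Thus $F$ is the characteristic function of $\mathcal{C}$,
and for $A \in \mathcal{A}$,

$$A \in \mathcal{C} \;\Leftrightarrow\; A \text{ includes every } B < A \text{ for which } B \in \mathcal{C}.$$

We claim that $\bigcup\mathcal{C}$ is a maximal element of $\mathcal{A}$.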




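First of all, $\mathcal{C}$ is a chain. For if $A$ and $B$ are in $\mathcal{C}$, then

$$A < B \;\Rightarrow\; A \subseteq B \qquad\text{and}\qquad B < A \;\Rightarrow\; B \subseteq A.$$

Since $\mathcal{C}$ is a chain, $\bigcup\mathcal{C} \in \mathcal{A}$. To prove maximality, suppose that
$\bigcup\mathcal{C} \subseteq D \in \mathcal{A}$. Then $D \in \mathcal{C}$, since $D$ includes every member of $\mathcal{C}$. Hence
$D \subseteq \bigcup\mathcal{C}$, and so $D = \bigcup\mathcal{C}$. Thus $\bigcup\mathcal{C}$ is not a proper subset of any
member of $\mathcal{A}$. ⊣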
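
The construction in this proof can be tried out on a finite family, where a list order stands in for the well ordering and no transfinite recursion is needed. The following is only a sketch under that assumption; the function name `greedy_chain` and the sample family are illustrative and not part of the text.

```python
def greedy_chain(family):
    """Walk through `family` (a list of frozensets) in list order, which plays
    the role of the well ordering <. A set is admitted to the chain exactly
    when it includes every set admitted before it (the condition F(A) = 1)."""
    chain = []
    for A in family:
        if all(B <= A for B in chain):
            chain.append(A)
    # The union of the chain is the candidate maximal element.
    top = frozenset().union(*chain)
    return chain, top


# A small sample family; any nonempty finite family of sets would do.
family = [frozenset(s) for s in ({1}, {2}, {1, 3}, {2, 4}, {1, 2, 3}, {1, 3, 5})]
chain, top = greedy_chain(family)
```

For a finite nonempty family the union of a nonempty chain is just its largest member, so closure under unions of chains holds automatically. A different list order may produce a different maximal element.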


There are several easy consequences of our definition of cardinal 
numbers. Say that an ordinal number is an initial ordinal iff it is not 
equinumerous to any smaller ordinal number. Then any initial ordinal is 
its own cardinal number. Conversely any cardinal number must, by the 
definition, be an initial ordinal. Thus the cardinal numbers and the initial 
ordinals are exactly the same things. 

Next consider any cardinal number, say $\kappa = \operatorname{card} S$. Then $\kappa \approx S$ (and $\kappa$
is the least ordinal with this property). Also $\operatorname{card} \kappa = \kappa$, since $\kappa$ is an initial
ordinal. That is, $\kappa$ is itself a set of cardinality $\kappa$.

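The cardinal numbers, like any other ordinal numbers, are well ordered
by the epsilon relation as described in Theorem 7M. It is now easy to
verify that this ordering agrees with the ordering of cardinal numbers defined
in Chapter 6. For any cardinals $\kappa$ and $\lambda$,

$$\begin{aligned}
\kappa \in \lambda &\;\Rightarrow\; \kappa \subsetneq \lambda \\
&\;\Rightarrow\; \kappa \preccurlyeq \lambda \ \&\ \kappa \neq \lambda \\
&\;\Rightarrow\; \kappa \preccurlyeq \lambda \ \&\ \kappa \not\approx \lambda \\
&\;\Rightarrow\; \kappa < \lambda.
\end{aligned}$$

Conversely,

$$\begin{aligned}
\kappa \notin \lambda &\;\Rightarrow\; \lambda \mathrel{\underline{\in}} \kappa \\
&\;\Rightarrow\; \lambda \le \kappa \quad\text{by the above} \\
&\;\Rightarrow\; \kappa \not< \lambda.
\end{aligned}$$

In particular, any nonempty set of cardinal numbers has a smallest element.
And the word “smallest” is used unambiguously here, since the epsilon
ordering and the cardinality ordering agree. The smallest cardinal after
$\aleph_0$ is of course denoted as $\aleph_1$. We can also characterize $\aleph_1$ as the least
uncountable ordinal. Next in order of magnitude comes $\aleph_2$. And this
continues. In Chapter 8, we define $\aleph_\alpha$ for every ordinal $\alpha$.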


Exercises 


22. Prove the well-ordering theorem from the version of Zorn’s lemma
given in Exercise 21.

23. Assume that $A$ is a set and define (as in Hartogs’ theorem) $\alpha$ to be
the set of ordinals dominated by $A$. Show that (i) $\alpha$ is a cardinal number,
(ii) $\operatorname{card} A < \alpha$, and (iii) $\alpha$ is the least cardinal greater than $\operatorname{card} A$.

24. Show that for any ordinal number $\alpha$, there is a cardinal number $\kappa$
that is (as an ordinal) larger than $\alpha$.

25. (Transfinite induction schema) Let $\varphi(x)$ be a formula and show that
the following holds:

Assume that for any ordinal $\alpha$, we have $(\forall x \in \alpha)\,\varphi(x) \Rightarrow \varphi(\alpha)$. Then $\varphi(\alpha)$
for any ordinal $\alpha$.


RANK 


In this final section of the chapter we want to return to the informal 
view of sets mentioned in Chapter 1—and especially to Fig. 3. The description 
that was somewhat vague at the start of this book can now be made quite 
precise. 

We want to define for every ordinal number $\alpha$ the set $V_\alpha$. $V_0$ is to be
empty, and, in general, $V_\alpha$ is to contain those sets whose members are all
in some $V_\beta$ for $\beta$ less than $\alpha$. Thus we want

$$\begin{aligned}
a \in V_\alpha &\;\Leftrightarrow\; a \subseteq V_\beta \ \text{ for some } \beta \in \alpha \\
&\;\Leftrightarrow\; a \in \mathcal{P}V_\beta \ \text{ for some } \beta \in \alpha,
\end{aligned}$$

or equivalently,

$$V_\alpha = \bigcup\{\mathcal{P}V_\beta \mid \beta \in \alpha\}.$$

Theorem 7U will show that in particular cases, this equation can be simplified.
If $\alpha = \delta^+$, the equation will become

$$V_\alpha = \mathcal{P}V_\delta,$$

whereas if $\alpha$ is not the successor of another ordinal, then the equation will
become

$$V_\alpha = \bigcup\{V_\beta \mid \beta \in \alpha\} = \bigcup_{\beta \in \alpha} V_\beta.$$

Thus $V_3 = \mathcal{P}\mathcal{P}\mathcal{P}\varnothing$ and $V_\omega = V_0 \cup V_1 \cup V_2 \cup \cdots$.

But we are getting ahead of ourselves. We have yet to show how to
define $V_\alpha$. The equation

$$V_\alpha = \bigcup\{\mathcal{P}V_\beta \mid \beta \in \alpha\}$$

defines $V_\alpha$ in terms of $V_\beta$ for $\beta$ less than $\alpha$. So we need to define $V_\alpha$ by
transfinite recursion. The class of all ordinals is well ordered, so transfinite
recursion over this class should give us a function-class $F$ such that

$$F(\alpha) = \bigcup\{\mathcal{P}F(\beta) \mid \beta \in \alpha\}$$

for each ordinal $\alpha$. But there is an obvious hitch here. We cannot work
directly with classes that are not sets. Instead we will sneak up gradually
on $F$.


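Lemma 7Q For any ordinal number $\delta$ there is a function $F_\delta$ with
domain $\delta$ such that

$$F_\delta(\alpha) = \bigcup\{\mathcal{P}F_\delta(\beta) \mid \beta \in \alpha\}$$

for every $\alpha \in \delta$.

Proof We apply transfinite recursion. On $\delta$ we have the well ordering
$\in_\delta$. Take for $\gamma(x, y)$ the formula

$$y = \bigcup\{\mathcal{P}z \mid z \in \operatorname{ran} x\}.$$

We can show that $\{\mathcal{P}z \mid z \in \operatorname{ran} x\}$ is a set by showing that it is included
in $\mathcal{P}\mathcal{P}\bigcup(\operatorname{ran} x)$. But an easier proof uses a replacement axiom: Let
$\varphi(z, w)$ be the formula $w = \mathcal{P}z$. Then because $\operatorname{ran} x$ is a set,
$\{\mathcal{P}z \mid z \in \operatorname{ran} x\}$ is also a set.

Obviously for any $f$ there is a unique $y$ such that $\gamma(f, y)$, namely
$y = \bigcup\{\mathcal{P}z \mid z \in \operatorname{ran} f\}$. Transfinite recursion then gives us a function $F_\delta$ such
that

$$(*)\qquad F_\delta(\alpha) = \bigcup\{\mathcal{P}z \mid z \in \operatorname{ran}(F_\delta \upharpoonright \operatorname{seg} \alpha)\}$$

for every $\alpha \in \delta$. Now then,

$$\operatorname{seg} \alpha = \{t \mid t \in_\delta \alpha\} = \{t \mid t \in \alpha\} = \alpha.$$

Thus

$$\begin{aligned}
z \in \operatorname{ran}(F_\delta \upharpoonright \operatorname{seg} \alpha) &\;\Leftrightarrow\; z \in \operatorname{ran}(F_\delta \upharpoonright \alpha) \\
&\;\Leftrightarrow\; z = F_\delta(\beta) \ \text{ for some } \beta \in \alpha.
\end{aligned}$$

So $(*)$ becomes

$$\begin{aligned}
F_\delta(\alpha) &= \bigcup\{\mathcal{P}z \mid z = F_\delta(\beta) \text{ for some } \beta \in \alpha\} \\
&= \bigcup\{\mathcal{P}F_\delta(\beta) \mid \beta \in \alpha\}
\end{aligned}$$

as desired. ⊣

Now consider any two ordinals $\delta$ and $\varepsilon$. The smaller of the two is
$\delta \cap \varepsilon$. (Why?) We have two functions $F_\delta$ and $F_\varepsilon$ from the preceding lemma.
These functions agree on $\delta \cap \varepsilon$:

Lemma 7R Let $\delta$ and $\varepsilon$ be ordinal numbers; let $F_\delta$ and $F_\varepsilon$ be functions
from Lemma 7Q. Then

$$F_\delta(\alpha) = F_\varepsilon(\alpha)$$

for all $\alpha \in \delta \cap \varepsilon$.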


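Proof By the symmetry, we suppose that $\delta \mathrel{\underline{\in}} \varepsilon$. Hence $\delta \subseteq \varepsilon$ and
$\delta \cap \varepsilon = \delta$. We will establish the equation $F_\delta(\alpha) = F_\varepsilon(\alpha)$ by using transfinite
induction in $\langle \delta, \in_\delta \rangle$. Define

$$B = \{\alpha \in \delta \mid F_\delta(\alpha) = F_\varepsilon(\alpha)\}.$$

In order to show that $B = \delta$, it suffices to show that $B$ is “$\in_\delta$-inductive,”
i.e., that

$$\operatorname{seg} \alpha \subseteq B \;\Rightarrow\; \alpha \in B$$

for each $\alpha \in \delta$.
We calculate:

$$\begin{aligned}
\operatorname{seg} \alpha \subseteq B &\;\Rightarrow\; F_\delta(\beta) = F_\varepsilon(\beta) \ \text{ for } \beta \in \alpha \\
&\;\Rightarrow\; \bigcup\{\mathcal{P}F_\delta(\beta) \mid \beta \in \alpha\} = \bigcup\{\mathcal{P}F_\varepsilon(\beta) \mid \beta \in \alpha\} \\
&\;\Rightarrow\; F_\delta(\alpha) = F_\varepsilon(\alpha) \\
&\;\Rightarrow\; \alpha \in B
\end{aligned}$$

for $\alpha \in \delta$. And so we are done. ⊣

In particular (by taking $\delta = \varepsilon$) we see that the function $F_\delta$ from Lemma
7Q is unique. We can now unambiguously define $V_\alpha$.

Definition Let $\alpha$ be an ordinal number. Define $V_\alpha$ to be the set $F_\delta(\alpha)$,
where $\delta$ is any ordinal greater than $\alpha$ (e.g., $\delta = \alpha^+$).

Theorem 7S For any ordinal number $\alpha$,

$$V_\alpha = \bigcup\{\mathcal{P}V_\beta \mid \beta \in \alpha\}.$$

Proof Let $\delta = \alpha^+$. Then $V_\alpha = F_\delta(\alpha)$ and $V_\beta = F_\delta(\beta)$ for $\beta \in \alpha$. Hence the
desired equation reduces to Lemma 7Q. ⊣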


Lemma 7T For any ordinal number $\alpha$, $V_\alpha$ is a transitive set.

Proof We would like to prove this by transfinite induction over the
class of all ordinals. This can be done by utilizing Exercise 25. But we can
also avoid that exercise by proving for each ordinal $\delta$ that $V_\alpha$ is a transitive
set whenever $\alpha \in \delta$. This requires only transfinite induction over $\langle \delta, \in_\delta \rangle$.
Let $B$ be the set on which the conclusion holds:

$$B = \{\alpha \in \delta \mid V_\alpha \text{ is a transitive set}\}.$$

We want to show that $B = \delta$; this will follow from the transfinite induction
theorem if we show that $B$ is “$\in_\delta$-inductive,” i.e., that

$$\alpha \subseteq B \;\Rightarrow\; \alpha \in B$$

for all $\alpha \in \delta$. (Recall that $\operatorname{seg} \alpha = \alpha$.)

Suppose then that $\alpha \subseteq B$. Then for any $\beta \in \alpha$, $V_\beta$ is a transitive set, as is
$\mathcal{P}V_\beta$ (by Exercise 3 of Chapter 4). It follows that $V_\alpha$ is a transitive set, because

$$\begin{aligned}
x \in V_\alpha &\;\Rightarrow\; x \in \mathcal{P}V_\beta \ \text{ for some } \beta \in \alpha \\
&\;\Rightarrow\; x \subseteq \mathcal{P}V_\beta \ \text{ for some } \beta \in \alpha \\
&\;\Rightarrow\; x \subseteq V_\alpha.
\end{aligned}$$

Thus $\alpha \in B$, and we are done. ⊣


There are three sorts of ordinal number. First, there is 0, called zero.
Second, there are the ordinals of the form $\alpha^+$ for a smaller ordinal $\alpha$. These
are called the successor ordinals. Third, there are all the others, called limit
ordinals. For example, the least limit ordinal is $\omega$.

If $\lambda$ is a limit ordinal and $\beta \in \lambda$, then $\beta^+ \in \lambda$. This is because $\beta^+$, being
the least ordinal larger than $\beta$, must satisfy $\beta^+ \mathrel{\underline{\in}} \lambda$. And $\beta^+ \neq \lambda$ since $\lambda$ is a
limit ordinal.

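The following theorem describes $V_\alpha$ for each of the three sorts of ordinal
number.

Theorem 7U (a) For ordinals $\beta \in \alpha$ we have $V_\beta \subseteq V_\alpha$.

(b) $V_0 = \varnothing$.

(c) $V_{\alpha^+} = \mathcal{P}V_\alpha$ for any ordinal number $\alpha$.

(d) $V_\lambda = \bigcup_{\beta \in \lambda} V_\beta$ for any limit ordinal $\lambda$.

Proof For part (a), first observe that $V_\beta \in \mathcal{P}V_\beta \subseteq V_\alpha$, so that $V_\beta \in V_\alpha$.
Since $V_\alpha$ is a transitive set, we also have $V_\beta \subseteq V_\alpha$.

Part (b) is clear.

For part (c), we first obtain from part (a)

$$\beta \in \alpha \;\Rightarrow\; V_\beta \subseteq V_\alpha \;\Rightarrow\; \mathcal{P}V_\beta \subseteq \mathcal{P}V_\alpha.$$

Now $V_{\alpha^+} = \bigcup\{\mathcal{P}V_\beta \mid \beta \in \alpha^+\}$ and this union is equal to the largest set, $\mathcal{P}V_\alpha$.

Finally for part (d) we have one inclusion:

$$\begin{aligned}
x \in V_\lambda &\;\Rightarrow\; x \in \mathcal{P}V_\beta \ \text{ for some } \beta \in \lambda \\
&\;\Rightarrow\; x \in V_{\beta^+} \quad\text{by part (c)} \\
&\;\Rightarrow\; x \in \bigcup_{\beta \in \lambda} V_\beta \quad\text{since } \beta^+ \in \lambda.
\end{aligned}$$

The other inclusion is similar:

$$\begin{aligned}
x \in \bigcup_{\beta \in \lambda} V_\beta &\;\Rightarrow\; x \in V_\beta \subseteq V_{\beta^+} = \mathcal{P}V_\beta \ \text{ for some } \beta \in \lambda \\
&\;\Rightarrow\; x \in V_\lambda.
\end{aligned}$$

Thus equality holds. ⊣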
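
For finite stages the equation of Theorem 7S can be evaluated directly, which gives a concrete check of parts (a) and (c) above and of Lemma 7T. The sketch below represents sets as Python frozensets; the helper names `powerset`, `V`, and `is_transitive` are our own, and only the first five stages are computed because the sizes grow very quickly.

```python
from functools import lru_cache
from itertools import combinations


def powerset(s):
    """Return the power set of the frozenset s as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


@lru_cache(maxsize=None)
def V(n):
    """V_n computed as the union of P(V_k) over all k < n (finite n only)."""
    return frozenset().union(*(powerset(V(k)) for k in range(n)))


def is_transitive(s):
    """A set is transitive iff every member of it is also a subset of it."""
    return all(x <= s for x in s)


stages = [V(n) for n in range(5)]
sizes = [len(s) for s in stages]  # number of members of V_0, ..., V_4
```

The next stage, V_5, already has 2^16 members, so the finite check is best stopped here.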


We will say that a set $A$ is grounded iff $A \subseteq V_\alpha$ for some ordinal number $\alpha$.
In this event, we define the rank of $A$ ($\operatorname{rank} A$) to be the least such ordinal $\alpha$.
Thus for a grounded set $A$:

$$A \subseteq V_{\operatorname{rank} A} \qquad\text{and}\qquad A \in V_{(\operatorname{rank} A)^+}.$$

The number $\operatorname{rank} A$ indicates how many times the power set operation must
be used to obtain all the members of $A$. For example, every ordinal $\alpha$ is
grounded and $\operatorname{rank} \alpha = \alpha$ (Exercise 26).

There is one methodological objection that can be raised concerning the
definition of rank. We want $\operatorname{rank} A$ to be the least member of $\{\alpha \mid A \subseteq V_\alpha\}$.
But (for a grounded set $A$) this is not a set of ordinals, but a proper class of
ordinals. Nonetheless, it has a least member. Being nonempty, it has some
member $\beta$. Either $\beta$ is its least member, or

$$\{\alpha \in \beta \mid A \subseteq V_\alpha\}$$

is a nonempty set of ordinals. In the latter case, we can apply Theorem 7M(e)
to obtain a least element.


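Theorem 7V (a) If $a \in A$ and $A$ is grounded, then $a$ is grounded and
$\operatorname{rank} a \in \operatorname{rank} A$.
(b) If every member of $A$ is grounded, then $A$ is also grounded and

$$\operatorname{rank} A = \bigcup\{(\operatorname{rank} a)^+ \mid a \in A\}.$$

Proof For the first part, assume that $A$ is grounded and that $a \in A$.
Then

$$\begin{aligned}
A \subseteq V_{\operatorname{rank} A} &\;\Rightarrow\; a \in V_{\operatorname{rank} A} \\
&\;\Rightarrow\; a \in \mathcal{P}V_\beta \ \text{ for some } \beta \in \operatorname{rank} A \\
&\;\Rightarrow\; a \subseteq V_\beta \ \text{ for some } \beta \in \operatorname{rank} A \\
&\;\Rightarrow\; \operatorname{rank} a \in \operatorname{rank} A.
\end{aligned}$$

For part (b), assume that every member of $A$ is grounded. Define

$$\alpha = \bigcup\{(\operatorname{rank} a)^+ \mid a \in A\}.$$

To show that $\alpha$ is a set, we use a replacement axiom first to conclude
that $\{(\operatorname{rank} a)^+ \mid a \in A\}$ is a set. Then by the union axiom, $\alpha$ is a set.
By Corollary 7N(d), $\alpha$ is an ordinal. First we claim that $A \subseteq V_\alpha$.
(This will show that $A$ is grounded and that $\operatorname{rank} A \mathrel{\underline{\in}} \alpha$.) We calculate

$$\begin{aligned}
a \in A &\;\Rightarrow\; a \subseteq V_{\operatorname{rank} a} \\
&\;\Rightarrow\; a \in V_{(\operatorname{rank} a)^+} \\
&\;\Rightarrow\; a \in V_\alpha \quad\text{by Theorem 7U(a),}
\end{aligned}$$

since $(\operatorname{rank} a)^+ \mathrel{\underline{\in}} \alpha$ for each $a \in A$. Thus $A \subseteq V_\alpha$.

To show that $\alpha \mathrel{\underline{\in}} \operatorname{rank} A$, we use part (a).

$$\begin{aligned}
a \in A &\;\Rightarrow\; \operatorname{rank} a \in \operatorname{rank} A \\
&\;\Rightarrow\; (\operatorname{rank} a)^+ \mathrel{\underline{\in}} \operatorname{rank} A.
\end{aligned}$$

Hence $\operatorname{rank} A$ is an upper bound for the ordinals $(\operatorname{rank} a)^+$, and so is at least
as large as their least upper bound $\alpha$. ⊣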


Part (b) of the foregoing theorem can be translated into words as: 
The rank of a set is the least ordinal that is strictly greater than the ranks 
of all members of the set. 
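
For hereditarily finite sets, part (b) of Theorem 7V turns directly into a recursive computation, since the union of finitely many natural numbers is their maximum. The following sketch models sets as nested frozensets; the helper names `rank`, `ordinal`, and `pair` are illustrative choices of ours.

```python
def rank(a):
    """rank a = sup of (rank x)+ over the members x of a.
    For finite sets the sup is a max, and the sup of no ordinals is 0."""
    return max((rank(x) + 1 for x in a), default=0)


def ordinal(n):
    """The natural number n as the set {0, 1, ..., n-1}, built by n+ = n U {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | frozenset({s})
    return s


def pair(a, b):
    """The unordered pair {a, b}."""
    return frozenset({a, b})


rank_of_four = rank(ordinal(4))                   # rank of the ordinal 4
rank_of_pair = rank(pair(ordinal(1), ordinal(3)))  # rank of the pair {1, 3}
```

The first value illustrates the remark that every ordinal is its own rank (Exercise 26), and the second is an instance of the first equation of Exercise 30.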


Theorem 7W The following two statements are equivalent.

(a) Every set is grounded.
(b) (Regularity) Every nonempty set $A$ has a member $m$ with
$m \cap A = \varnothing$.

Proof First assume that every set is grounded, and let $A$ be a nonempty
set. The idea is to take $m \in A$ having the least possible rank $\mu$. Thus from

$$\{\operatorname{rank} a \mid a \in A\},$$

we take the least member $\mu$, and then we select $m \in A$ having $\operatorname{rank} m = \mu$.

If $x \in m$, then by Theorem 7V(a), $\operatorname{rank} x \in \mu$. We cannot have $x \in A$,
due to the leastness of $\mu$. Hence $m \cap A = \varnothing$ as desired.

For the converse, assume that (b) holds. Suppose that, contrary to our
expectations, some set $c$ is not grounded. Then some set (e.g., $\{c\}$) has a
nongrounded member. And hence some transitive set $B$ has a nongrounded
member; we can take $B$ to be the transitive closure of $\{c\}$ (see Exercise 7).

Let $A = \{x \in B \mid x \text{ is not grounded}\}$. Since $A$ is nonempty, by (b) there
is some $m \in A$ with $m \cap A = \varnothing$. We claim that every member of $m$ is
grounded. If $x \in m$, then $x \in B$ since $B$ is transitive. But $x \notin A$ because
$m \cap A = \varnothing$, so $x$ must be grounded.

Since every member of $m$ is grounded, we can conclude from Theorem
7V(b) that $m$ is also grounded. This contradicts the fact that $m \in A$. ⊣
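
The first half of this proof is constructive enough to run on hereditarily finite sets: choose a member of least rank and observe that it is disjoint from the whole set. This is a sketch only; the names `rank` and `least_rank_member` and the sample set are our own.

```python
def rank(a):
    """Rank of a hereditarily finite set given as nested frozensets."""
    return max((rank(x) + 1 for x in a), default=0)


def least_rank_member(A):
    """Return some member m of the nonempty set A whose rank is least."""
    return min(A, key=rank)


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

# A sample nonempty set: {1, 2, {1}}.
A = frozenset({one, two, frozenset({one})})
m = least_rank_member(A)
disjoint = (m & A == frozenset())  # does m share no member with A?
```

Any member of `m` has smaller rank than `m`, so it cannot belong to `A`; this is exactly the argument in the text.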

Near the bottom of p. 8, we stated as “a fundamental principle”
that every set is grounded. So on an informal level at least, we must accept 
both parts of the above theorem as true. 


But neither part is provable from the axioms adopted up to now. 
To correct this defect, we will now adopt the regularity axiom. For this 
axiom we could use either (a) or (b) from the above theorem. We select 
(b) because it has a more elementary formulation. 


Regularity Axiom Every nonempty set $A$ has a member $m$ with
$m \cap A = \varnothing$.


The regularity axiom is also known as the foundation axiom or the 
Fundierungsaxiom. The idea first appeared in a 1917 paper by Mirimanoff; 
the axiom was listed explicitly in von Neumann’s 1925 paper. 


The concept of rank, as well as the concept of regularity, appeared in 
Mirimanoff’s 1917 paper. The idea of rank is a descendant of Russell’s 
concept of type. 

The following theorem lists some basic consequences of regularity. 


Theorem 7X (a) No set is a member of itself.
(b) There do not exist any sets $a$ and $b$ with $a \in b$ and $b \in a$.
(c) There is no function $f$ with domain $\omega$ such that

$$\cdots \in f(2) \in f(1) \in f(0).$$

Proof For part (c), suppose to the contrary that $f(n^+) \in f(n)$ for each
$n \in \omega$. Let $A = \operatorname{ran} f$. Any member $m$ of $A$ must equal $f(n)$ for some $n$.
Then $f(n^+) \in m \cap A$, so that $m \cap A$ is always nonempty. This contradicts
regularity.

Parts (a) and (b) can be proved either by similar arguments or as 
consequences of part (c). We leave the details as an exercise. 4 


Since every set is grounded, the sets are arranged in an orderly
hierarchy according to their rank. This is the situation that Fig. 3 attempts
to illustrate. Thus the universe of all sets is, in a sense, determined by 
two factors: 


1. the extent of the class of all ordinals, and 
2. the variety of subsets that are assigned to a set by the power set
operation.

Exercises


26. Show that every ordinal number $\alpha$ is grounded, and that $\operatorname{rank} \alpha = \alpha$.


27. Show that the set $\mathbb{R}$ of real numbers, as constructed in Chapter 5,
has rank $\omega^{+++++}$.


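28. Show that

$$V_\alpha = \{X \mid \operatorname{rank} X \in \alpha\}.$$

29. Prove parts (a) and (b) of Theorem 7X.

30. Show that for any sets $a$ and $b$,

$$\begin{aligned}
\operatorname{rank}\{a, b\} &= \max(\operatorname{rank} a, \operatorname{rank} b)^+, \\
\operatorname{rank} \mathcal{P}a &= (\operatorname{rank} a)^+, \\
\operatorname{rank} \textstyle\bigcup a &\mathrel{\underline{\in}} \operatorname{rank} a.
\end{aligned}$$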


31. Define kard A to be the collection of all sets B such that (i) A is 
equinumerous to B, and (ii) nothing of rank less than rank B is equinumerous 
to B. 

(a) Show that kard A is a set. 

(b) Show that kard A is nonempty. 

(c) Show that for any sets $A$ and $B$,

$$\operatorname{kard} A = \operatorname{kard} B \quad\text{iff}\quad A \approx B.$$


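32. Let $\langle A, R \rangle$ be a structure. Define the isomorphism type $\operatorname{it}\langle A, R \rangle$ of
this structure to be the set of all structures $\langle B, S \rangle$ such that (i) $\langle A, R \rangle$ is
isomorphic to $\langle B, S \rangle$, and (ii) nothing of rank less than $\operatorname{rank}\langle B, S \rangle$ is
isomorphic to $\langle B, S \rangle$.

(a) Show that $\operatorname{it}\langle A, R \rangle$ is a set (and not a proper class).

(b) Show that $\operatorname{it}\langle A, R \rangle$ is nonempty.

(c) Show that

$$\operatorname{it}\langle A, R \rangle = \operatorname{it}\langle B, S \rangle \quad\text{iff}\quad \langle A, R \rangle \cong \langle B, S \rangle.$$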


33. Assume that $D$ is a transitive set. Let $B$ be a set with the property
that for any $a$ in $D$,

$$a \subseteq B \;\Rightarrow\; a \in B.$$

Show that $D \subseteq B$.

34. Assume that

$$\{x, \{x, y\}\} = \{u, \{u, v\}\}$$

and show that $x = u$ and $y = v$.

35. Show that if $a^+ = b^+$, then $a = b$.


36. Show that the rank of any set S is the same as the rank of its 
transitive closure TC S (as defined in Exercise 7). 


37. Show that a set $\alpha$ is an ordinal number iff it is a transitive set with
the property that for any distinct $x$ and $y$ in $\alpha$, either $x \in y$ or $y \in x$.


38. Show that whenever $\lambda$ is a limit ordinal, then $\lambda = \bigcup\lambda$.


39. Prove that a set is an ordinal number iff it is a transitive set of 
transitive sets. 


CHAPTER 8 


ORDINALS AND ORDER TYPES 


This chapter ends with a discussion of the arithmetic of ordinal numbers. 
That topic is preceded by material that is useful for ordinal arithmetic, and 
that also has independent interest. 

The logical dependencies among the sections of this chapter are as follows. 
The section on alephs depends on the preceding section on transfinite 
recursion. The section on ordinal operations can be independent of the others, 
but it refers to the section on alephs for examples. The section on 
isomorphism types is independent of the earlier sections, and in turn forms 
the basis for the following section on the arithmetic of order types. 

The section on ordinal arithmetic refers to the preceding section for 
addition and multiplication, and to transfinite recursion for exponentiation. 
But it is indicated how all of the operations can be based on transfinite 
recursion, thereby avoiding order types. In either event, the section on 
ordinal operations is utilized here. 


TRANSFINITE RECURSION AGAIN 


In Chapter 7 we defined a set $V_\alpha$ for each ordinal $\alpha$. That definition
proceeded in a somewhat roundabout manner, defining first a function
$F_\delta$ for each $\delta$, and then proving that $F_\delta(\alpha)$ was actually independent of $\delta$
(as long as $\delta$ was greater than $\alpha$). We could then define $V_\alpha$ to be $F_\delta(\alpha)$
for any large $\delta$.

It is interesting to note that the goal of the construction was not a
function but a definition. We could not hope to have a function $F$ with
$F(\alpha) = V_\alpha$ for every $\alpha$, because the domain of $F$ would have to be the
proper class of all ordinals.

We now want a general transfinite recursion theorem that will yield
directly the definition of $V_\alpha$, and will be applicable to other cases as well.
In terms of proper classes, we can state this theorem as follows:

Let $G$ be a function-class whose domain is the class $V$ of all sets. Then we
claim that there is a function-class $F$ whose domain is the class of all
ordinals and such that

$$F(\alpha) = G(F \upharpoonright \alpha)$$

for every $\alpha$.

Now we must apply the standard rewording. In place of $G$ we have a
formula $\gamma(x, y)$ that defines $G$. The assumption that $G$ is a function-class
with domain $V$ is reworded to state that for any $x$ there is a unique $y$ such
that $\gamma(x, y)$ holds. In place of $F$ we also have a formula $\varphi(x, y)$ defining $F$.
For every ordinal $\alpha$ there must be a unique $y$ such that $\varphi(\alpha, y)$ holds.

The equation

$$F(\alpha) = G(F \upharpoonright \alpha)$$

is reworded to state that whenever $f$ is a function with domain $\alpha$ such
that $\varphi(\beta, f(\beta))$ holds for all $\beta \in \alpha$ (i.e., whenever $f = F \upharpoonright \alpha$) then the unique
$y$ such that $\varphi(\alpha, y)$ equals the unique $y$ such that $\gamma(f, y)$.

Transfinite Recursion Schema on the Ordinals Let $\gamma(x, y)$ be any formula.
Then we can construct another formula $\varphi(u, v)$ such that the following is
a theorem.

Assume that for every $f$ there is a unique set $y$ such that $\gamma(f, y)$.
Then for every ordinal number $\alpha$ there is a unique $y$ such that $\varphi(\alpha, y)$.
Furthermore whenever $f$ is a function whose domain is an ordinal number
$\alpha$ and such that $\varphi(\beta, f(\beta))$ for all $\beta$ less than $\alpha$, we then have

$$\varphi(\alpha, y) \quad\text{iff}\quad \gamma(f, y)$$

for every $y$.


We will give the proof presently. The significance of the theorem is
that it justifies making a certain definition. We select some available symbol,
say “t,” and define the term

$$t_\alpha = \text{the unique } y \text{ such that } \varphi(\alpha, y)$$




for every ordinal $\alpha$. The operation of going from $\alpha$ to $t_\alpha$ is $\gamma$-constructed
in the following sense. Whenever $f$ is the restriction of the operation to
some ordinal $\alpha$ (i.e., $\operatorname{dom} f = \alpha$ and $f(\beta) = t_\beta$ for each $\beta$ less than $\alpha$), then
we can conclude from the above theorem that $\gamma(f, t_\alpha)$. We might abbreviate
this by writing

$$\gamma(t \upharpoonright \alpha,\ t_\alpha).$$


Example We will apply the foregoing theorem to construct $V_\alpha$. For $\gamma$
use the formula we used before (in Lemma 7Q), so that $\gamma(f, y)$ is

$$y = \bigcup\{\mathcal{P}z \mid z \in \operatorname{ran} f\}.$$

Now plug this into the theorem and the subsequent discussion, but select
the symbol “V” instead of “t.” We get a set $V_\alpha$ for each $\alpha$, but we must
use the fact that the operation is $\gamma$-constructed to be sure that we have
the right $V_\alpha$. So consider any fixed $\alpha$ and let $f$ be the restriction of the
operation to $\alpha$:

$$f(\beta) = V_\beta$$

for $\beta$ less than $\alpha$. Then we can conclude that $\gamma(f, V_\alpha)$, which when
expanded becomes

$$\begin{aligned}
V_\alpha &= \bigcup\{\mathcal{P}z \mid z \in \operatorname{ran} f\} \\
&= \bigcup\{\mathcal{P}V_\beta \mid \beta \in \alpha\}.
\end{aligned}$$

Comparing this equation with Theorem 7S, we see (by transfinite induction)
that we have the same $V_\alpha$ as in Chapter 7.
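
Restricted to finite ordinals, the schema is ordinary course-of-values recursion: each value is obtained by applying a fixed operation to the sequence of all earlier values. The sketch below is only a finite analogue under that restriction; the names `recurse` and `gamma_V` are our own, and the restriction of the operation to $n$ is represented by a Python tuple.

```python
from itertools import combinations


def powerset(s):
    """Return the power set of the frozenset s as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def recurse(G, n):
    """Return [t_0, ..., t_(n-1)], where each t_k is G applied to the tuple
    (t_0, ..., t_(k-1)), i.e., to the restriction of the operation to k."""
    values = []
    for _ in range(n):
        values.append(G(tuple(values)))
    return values


def gamma_V(f):
    """The y determined by gamma(f, y): the union of P(z) over z in ran f."""
    return frozenset().union(*(powerset(z) for z in f))


V = recurse(gamma_V, 5)   # V_0, ..., V_4
shorter = recurse(gamma_V, 3)
```

Running the recursion to a shorter length yields an initial segment of the longer run, which is the finite counterpart of the fact that the value at $\alpha$ does not depend on the choice of the larger ordinal $\delta$.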


Proof of the theorem The formula $\varphi(\alpha, y)$ is the following:

There exists an ordinal $\delta$ greater than $\alpha$ and there exists a $\gamma$-constructed
function $F_\delta$ with domain $\delta$ such that $F_\delta(\alpha) = y$.

As in Chapter 7, $F_\delta$ is said to be $\gamma$-constructed iff $\gamma(F_\delta \upharpoonright \beta, F_\delta(\beta))$ for
every $\beta \in \delta$.

First we will show that these $F_\delta$ functions always exist. We are given
that for every $f$ there is a unique $y$ such that $\gamma(f, y)$. Hence for any
ordinal $\delta$ we can apply transfinite recursion (from Chapter 7) to the
well-ordered structure $\langle \delta, \in_\delta \rangle$. By doing so, we obtain the unique $\gamma$-constructed
function $F_\delta$ with domain $\delta$. Now suppose we are given some ordinal $\alpha$ and
we want to find some $y$ such that $\varphi(\alpha, y)$. We choose $\delta$ to be any larger
ordinal (such as $\alpha^+$) and take $y = F_\delta(\alpha)$. Then clearly $\varphi(\alpha, y)$ holds.

Next we will prove a uniqueness result. Namely, we claim that if $F_\delta$
and $F_\varepsilon$ are $\gamma$-constructed functions with domains $\delta$ and $\varepsilon$, respectively, then
$F_\delta(\alpha) = F_\varepsilon(\alpha)$ for all $\alpha$ in $\delta \cap \varepsilon$. (This corresponds to Lemma 7R in the
special case of $V_\alpha$.) The claim is proved by transfinite induction. For the
inductive step, suppose that $F_\delta(\beta) = F_\varepsilon(\beta)$ for all $\beta \in \alpha$. Then $F_\delta \upharpoonright \alpha = F_\varepsilon \upharpoonright \alpha$.
Since both $\gamma(F_\delta \upharpoonright \alpha, F_\delta(\alpha))$ and $\gamma(F_\varepsilon \upharpoonright \alpha, F_\varepsilon(\alpha))$, we can conclude that
$F_\delta(\alpha) = F_\varepsilon(\alpha)$.

It now follows that for any ordinal $\alpha$ there is a unique $y$ such that
$\varphi(\alpha, y)$, and we know what this unique $y$ is. Finally assume that $f$ is a
function whose domain is $\alpha$ and that $\varphi(\beta, f(\beta))$ holds for all $\beta \in \alpha$. We
must show that

$$\varphi(\alpha, y) \;\Leftrightarrow\; \gamma(f, y)$$

for each $y$. But what is $f$? It must be the function $F_\alpha$, by the above.
The only $y$ for which $\gamma(F_\alpha, y)$ is $F_\delta(\alpha)$, where $\delta$ is some larger ordinal.
And it is also the only $y$ for which $\varphi(\alpha, y)$. ⊣


In the next section we will see further examples of the use of transfinite 
recursion on the ordinals. 


ALEPHS 


Suppose that we have defined some class $A$ of ordinals. Now $A$ might
or might not be a set. If $A$ is bounded, i.e., if there is an ordinal $\beta$ such
that $\alpha \mathrel{\underline{\in}} \beta$ for each $\alpha$ in $A$, then $A \subseteq \beta^+$ and consequently $A$ is a set.
On the other hand, if $A$ is unbounded, then $\bigcup A$ is the class of all ordinals.
(This holds because any $\beta$, failing to bound $A$, is less than some $\alpha$ in $A$.
Thus $\beta \in \alpha \in A$ and $\beta \in \bigcup A$.) By the Burali-Forti theorem, $A$ cannot be a
set. Thus we can conclude that $A$ is a set iff it is bounded.

Now focus attention on the case where we have defined an unbounded
(and hence proper) class of ordinals. As an example we will take the class of
infinite cardinal numbers, but later we can apply our work to other classes.
The class of infinite cardinal numbers is unbounded. (One way to prove
this fact is to observe that for any ordinal $\beta$ we obtain from Hartogs’
theorem the least ordinal $\alpha$ not dominated by $\beta$. Then $\alpha$ is an initial ordinal
and is larger than $\beta$. An alternative proof uses $2^{\operatorname{card} \beta}$.)

The class of infinite cardinal numbers, like any class of ordinals,
inherits the well ordering by epsilon on the class of all ordinals, given
by Theorem 7M. We want to enumerate its members in ascending order.
There is a least infinite cardinal, which we have always called $\aleph_0$. Then
there is a next one (the least infinite cardinal greater than $\aleph_0$), which we
call $\aleph_1$. And so forth.

We will now expand that “and so forth.” Suppose we have worked our
way up to $\alpha$, i.e., we have defined $\aleph_\beta$ for all $\beta$ less than $\alpha$ and we are ready
to try defining $\aleph_\alpha$. (In the preceding paragraph $\alpha = 2$.) Naturally we define

$\aleph_\alpha$ = the least infinite cardinal different from $\aleph_\beta$ for every $\beta$ less than $\alpha$.

Such a cardinal must exist, because $\{\aleph_\beta \mid \beta \in \alpha\}$ is merely a set, whereas the
class of infinite cardinals is unbounded.

Now that we know how to construct $\aleph_\alpha$ from the smaller alephs, we can
apply transfinite recursion on the ordinals. Choose for $\gamma(f, y)$ the formula

$y$ is the least infinite cardinal not in $\operatorname{ran} f$.

Then for any set $f$, there is a unique $y$ such that $\gamma(f, y)$, again because
$\operatorname{ran} f$ is merely a set whereas the class of infinite cardinals is unbounded.
Then transfinite recursion lets us pick a symbol (we pick “$\aleph$”) and define
$\aleph_\alpha$ for each $\alpha$ in a way that is $\gamma$-constructed. That is, if $f$ is the function
with domain $\alpha$ defined by

$$f(\beta) = \aleph_\beta \quad\text{for } \beta \in \alpha,$$

then $\gamma(f, \aleph_\alpha)$. And $\gamma(f, \aleph_\alpha)$, when written out, becomes

$\aleph_\alpha$ is the least infinite cardinal not in $\operatorname{ran} f$,

which by the definition of $f$ becomes

$\aleph_\alpha$ is the least infinite cardinal different from $\aleph_\beta$ for every $\beta$ less than $\alpha$.


The following theorem verifies that this construction enumerates the
infinite cardinals in ascending order.

Theorem 8A (a) If $\alpha \in \beta$, then $\aleph_\alpha < \aleph_\beta$.
(b) Every infinite cardinal is of the form $\aleph_\alpha$ for some $\alpha$.

Proof (a) This is a consequence of the fact that $\aleph_\alpha$ is the least cardinal
meeting certain conditions that become more stringent as $\alpha$ increases. Both
$\aleph_\alpha$ and $\aleph_\beta$ meet the condition of being different from $\aleph_\gamma$ for all $\gamma$ less than
$\alpha$. Since $\aleph_\alpha$ was the least such candidate (and $\aleph_\alpha \neq \aleph_\beta$), we have $\aleph_\alpha < \aleph_\beta$.

(b) We use transfinite induction on the class of infinite cardinal
numbers (a subclass of the ordinals). Suppose, as the inductive hypothesis,
that $\kappa$ is an infinite cardinal for which all smaller infinite cardinals are in
the range of the aleph operation. Consider the corresponding set $\{\beta \mid \aleph_\beta < \kappa\}$
of ordinals. This is a set (and not a proper class), being no larger than $\kappa$.
And it is a transitive set by part (a). So it is itself an ordinal; call it $\alpha$.
By construction, $\aleph_\alpha$ is the least infinite cardinal different from $\aleph_\beta$ whenever
$\aleph_\beta < \kappa$. By the inductive hypothesis, this is the least infinite cardinal different
from all those less than $\kappa$, which is just $\kappa$ itself. ⊣


The construction up to here is applicable to any unbounded class $A$ of
ordinals that we might have defined. We take for $\gamma(f, y)$ the formula

$y$ is the least member of $A$ not in $\operatorname{ran} f$.

Transfinite recursion then lets us define $t_\alpha$ for every $\alpha$ in such a way that
$t_\alpha$ is the least member of $A$ different from $t_\beta$ for every $\beta$ less than $\alpha$.

The analogue of Theorem 8A holds (and with the same proof), showing
that we have constructed an enumeration of A in ascending order.
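
A finite analogue of this enumeration can be run with the natural numbers in place of the ordinals: take an unbounded class of natural numbers, and let each new value be the least member not yet used. The names `enumerate_class` and `is_prime` below are illustrative, and the membership test is assumed to describe an unbounded class (otherwise the search would never stop).

```python
from itertools import count


def enumerate_class(in_A, n):
    """Return [t_0, ..., t_(n-1)], where t_k is the least natural number x
    with in_A(x) true that differs from every earlier value."""
    values = []
    for _ in range(n):
        seen = set(values)  # plays the role of ran f
        values.append(next(x for x in count() if in_A(x) and x not in seen))
    return values


def is_prime(x):
    """Membership test for a sample unbounded class: the primes."""
    return x >= 2 and all(x % d for d in range(2, int(x ** 0.5) + 1))


first_primes = enumerate_class(is_prime, 6)
```

The resulting list is strictly increasing, which is the finite counterpart of part (a) of Theorem 8A.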

Now that we have a legal definition of $\aleph_\alpha$, we can try to see what
that definition can give us. There are three possibilities for $\alpha$: zero, a
successor ordinal, or a limit ordinal. We already know about $\aleph_0$, so
consider the case where $\alpha$ is a successor ordinal, say $\alpha = \beta^+$. By the
construction and Theorem 8A, $\aleph_{\beta^+}$ is the least infinite cardinal greater than
$\aleph_\gamma$ for every $\gamma$ less than $\beta^+$. We can simplify this since the largest
candidate for $\aleph_\gamma$ is $\aleph_\beta$. Thus

$\aleph_{\beta^+}$ = the least cardinal greater than $\aleph_\beta$.

By Exercise 23 of Chapter 7, the least cardinal greater than $\aleph_\beta$, as a set,
is the set of all ordinals dominated by the set $\aleph_\beta$.

Now consider the case of a limit ordinal $\lambda$. As before, $\aleph_\lambda$ is the least
infinite cardinal greater than $\aleph_\beta$ for every $\beta$ less than $\lambda$. We claim that
in fact

$$\aleph_\lambda = \bigcup_{\beta \in \lambda} \aleph_\beta.$$

From Chapter 7 we know that $\bigcup_{\beta \in \lambda} \aleph_\beta$ is the least ordinal greater than
$\aleph_\beta$ for every $\beta$ in $\lambda$. The following lemma shows that it is actually a
cardinal number.


Lemma 8B The union of any set of cardinal numbers is itself a cardinal
number.

Proof Let $A$ be any set of cardinals. Then $\bigcup A$ is an ordinal by
Corollary 7N(d). We must show that it is an initial ordinal. So assume
that $\alpha \in \bigcup A$; we must show that $\alpha \not\approx \bigcup A$. We know that $\alpha \in \kappa \in A$ for
some cardinal number $\kappa$. Thus $\alpha \in \kappa \subseteq \bigcup A$, so that if $\alpha \approx \bigcup A$, then $\alpha \approx \kappa$.
But it is impossible to have $\alpha \approx \kappa$, since $\kappa$ is an initial ordinal. Hence
$\alpha \not\approx \bigcup A$. ⊣


As another application of transfinite recursion on the ordinals,
we can define the beth numbers $\beth_\alpha$. (The letter $\beth$, called beth, is the second
letter of the Hebrew alphabet.) The following three equations describe the
term $\beth_\alpha$ that we want to define:

$$\begin{aligned}
\beth_0 &= \aleph_0, \\
\beth_{\alpha^+} &= 2^{\beth_\alpha}, \\
\beth_\lambda &= \bigcup_{\alpha \in \lambda} \beth_\alpha,
\end{aligned}$$

where $\lambda$ is a limit ordinal. The transfinite recursion machinery lets us convert
these three equations into a legal definition.

To operate the machinery, let $\gamma(f, y)$ be the formula:

Either (i) $f$ is a function with domain 0 and $y = \aleph_0$,

or (ii) $f$ is a function whose domain is a successor ordinal $\alpha^+$
and $y = 2^{f(\alpha)}$,

or (iii) $f$ is a function whose domain is a limit ordinal $\lambda$, and
$y = \bigcup(\operatorname{ran} f)$,

or (iv) none of the above and $y = \varnothing$.

Then transfinite recursion gives us a formula $\varphi$ with the usual properties.
We select the symbol $\beth$ and define

$$\beth_\alpha = \text{the unique } y \text{ such that } \varphi(\alpha, y).$$

Then the machinery tells us that “$\gamma(\beth \upharpoonright \alpha, \beth_\alpha)$.” Writing out this condition
with $\alpha = 0$ produces the equation

$$\beth_0 = \aleph_0.$$

Similarly, by using a successor ordinal and then a limit ordinal we get the
other two equations

$$\begin{aligned}
\beth_{\alpha^+} &= 2^{\beth_\alpha}, \\
\beth_\lambda &= \bigcup_{\alpha \in \lambda} \beth_\alpha.
\end{aligned}$$

The continuum hypothesis (mentioned in Chapter 6) can now be stated
by the equation $\aleph_1 = \beth_1$, and the generalized continuum hypothesis is the
assertion that $\aleph_\alpha = \beth_\alpha$ for every $\alpha$. (We are merely stating these hypotheses
as objects for consideration; we are not claiming that they are true.)
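
As a small worked illustration (our own unwinding of the three defining equations, not an additional result), the first few beth numbers can be written out explicitly:

```latex
\begin{align*}
\beth_0 &= \aleph_0, \\
\beth_1 &= 2^{\beth_0} = 2^{\aleph_0}, \\
\beth_2 &= 2^{\beth_1} = 2^{2^{\aleph_0}}, \\
\beth_\omega &= \bigcup_{n \in \omega} \beth_n .
\end{align*}
```

Since $\aleph_0 < 2^{\aleph_0}$ by Cantor's theorem and $\aleph_1$ is the least cardinal greater than $\aleph_0$, we have $\aleph_1 \le \beth_1$ in any case; the continuum hypothesis asserts that equality holds.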


Exercises 


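1. Show how to define a term $t_\alpha$ (for each ordinal $\alpha$) so that $t_0 = 5$,
$t_{\alpha^+} = (t_\alpha)^+$, and $t_\lambda = \bigcup_{\alpha \in \lambda} t_\alpha$ for a limit ordinal $\lambda$.

2. In the preceding exercise, show that if $\alpha \in \omega$, then $t_\alpha = 5 + \alpha$. Show
that if $\omega \mathrel{\underline{\in}} \alpha$, then $t_\alpha = \alpha$.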


ORDINAL OPERATIONS 


There are several operations on the class of ordinal numbers that are
of interest to us. A few of these operations have already been defined: the
successor operation assigning to each ordinal $\beta$ its successor $\beta^+$, and the
aleph and beth operations assigning to $\beta$ the numbers $\aleph_\beta$ and $\beth_\beta$. More
examples will be encountered when we study ordinal arithmetic.
These operations do not correspond to functions (which are sets of
ordered pairs), because the class of all ordinals fails to be a set. Instead
for each ordinal $\beta$ we define an ordinal $t_\beta$ as the unique ordinal meeting
certain specified conditions. We will say that the operation is monotone iff
the condition

$$\alpha \in \beta \;\Rightarrow\; t_\alpha \in t_\beta$$

always holds. We will say that the operation is continuous iff the equation

$$t_\lambda = \bigcup_{\beta \in \lambda} t_\beta$$

holds for every limit ordinal $\lambda$. Finally we will say that the operation is
normal iff it is both monotone and continuous.


Example If $\alpha \in \beta$, then $\alpha^+ \in \beta^+$ (by Exercise 16 of Chapter 7). Thus the
successor operation on the ordinals is monotone. But it lacks continuity,
because $\omega^+ \neq \bigcup_{n \in \omega} n^+ = \omega$.

Example The aleph operation is normal by Theorem 8A (for
monotonicity) and Lemma 8B with the accompanying discussion (for continuity).
More generally, whenever the $t_\alpha$’s enumerate an unbounded class $A$ of
ordinals in ascending order, then monotonicity holds (by the analogue of
Theorem 8A). Pursuing the analogy, we can assert that continuity holds if
$A$ is closed in the sense that the union of any nonempty set of members of
$A$ is itself a member of $A$. Conversely whenever we have defined some
normal ordinal operation $t_\alpha$, then its range-class

$$\{t_\alpha \mid \alpha \text{ is an ordinal number}\}$$

is a closed unbounded class of ordinal numbers (Exercise 6).


Recall (from the discussion following Corollary 7N) that whenever $S$ is
a set of ordinal numbers, then $\bigcup S$ is an ordinal that is the least upper
bound of $S$. It is therefore natural to define the supremum of $S$ ($\sup S$)
to be simply $\bigcup S$, with the understanding that we will use this notation
only when $S$ is a set of ordinal numbers. For example,

$$\omega = \sup\{0, 2, 4, \ldots\}.$$

The condition for an operation taking $\beta$ to $t_\beta$ to be continuous can be
stated

$$t_\lambda = \sup\{t_\beta \mid \beta \in \lambda\}$$

for limit ordinals $\lambda$.


The following result shows that for a continuous operation, the condition 
for monotonicity can be replaced by a more “local” version.


Theorem Schema 8C Assume that we have defined a continuous
operation assigning an ordinal t_β to each ordinal number β. Then the
operation is monotone provided that t_γ ∈ t_{γ⁺} for every ordinal γ.


Proof We consider a fixed ordinal α and prove by transfinite induction
on β that


α ∈ β ⇒ t_α ∈ t_β.


Case I β is zero. Then the above condition is vacuously true, since
α ∉ β.

Case II β is a successor ordinal γ⁺. Then


α ∈ β ⇒ α ∈̲ γ
⇒ t_α ∈̲ t_γ     by the inductive hypothesis
⇒ t_α ∈ t_{γ⁺}   since t_γ ∈ t_{γ⁺}
⇒ t_α ∈ t_β.


Case III β is a limit ordinal. Then

α ∈ β ⇒ α ∈ α⁺ ∈ β
⇒ t_α ∈ t_{α⁺} ⊆ t_β

by continuity. Hence t_α ∈ t_β. ⊣


Example The beth operation (assigning ℶ_β to β) is normal. The continuity
is obvious from its definition. And since ℶ_β is always less than 2^{ℶ_β}, we have
(by the above theorem) monotonicity.


The next results will have useful consequences for the arithmetic of 
ordinal numbers. 


Theorem Schema 8D Assume that we have defined a normal operation
assigning an ordinal t_α to each ordinal α. Then for any given ordinal β that
is at least as large as t_0, there exists a greatest ordinal γ such that


t_γ ∈̲ β.


We know that any nonempty set of ordinals has a least member,
but the above theorem asserts that the set {γ | t_γ ∈̲ β} has a greatest member.


Proof Consider the set {α | t_α ∈̲ β}. This is a set of ordinals and it is
(by monotonicity) a transitive set. So it is itself an ordinal. It is not 0,
because t_0 ∈̲ β. Could it be a limit ordinal λ? If so, then


t_λ = sup{t_α | α ∈ λ} ∈̲ β,

whence λ ∈ λ. This is impossible, so our set must in fact be a successor
ordinal γ⁺. Thus γ is the largest member of the set, and so is the largest
ordinal for which t_γ ∈̲ β. ⊣


Theorem Schema 8E Assume that we have defined a normal operation
assigning t_α to each ordinal number α. Let S be a nonempty set of ordinal
numbers. Then

t_{sup S} = sup{t_α | α ∈ S}.

Proof We get the “≥” half by monotonicity:


α ∈ S ⇒ α ∈̲ sup S
⇒ t_α ∈̲ t_{sup S},

whence sup{t_α | α ∈ S} ∈̲ t_{sup S}.
For the other inequality, there are two cases. If S has a largest member
δ, then sup S = δ and so t_{sup S} = t_δ ∈̲ sup{t_α | α ∈ S}.
If S has no largest member, then sup S must be a limit ordinal
(since S ≠ ∅). So by continuity,


t_{sup S} = sup{t_β | β ∈ sup S}.

If β ∈ sup S, then β ∈ γ ∈ S for some γ. Consequently t_β ∈ t_γ, and so
t_β ∈ sup{t_α | α ∈ S}.
Since β was an arbitrary member of sup S,


sup{t_β | β ∈ sup S} ∈̲ sup{t_α | α ∈ S}

as desired. ⊣


For a monotone operation, we always have β ∈̲ t_β, by Exercise 5. Is it
possible to have β = t_β? Yes; the identity operation (assigning β to β) is
normal. But even for normal operations with rapid growth (such as the
beth operation), there are “fixed points” β with β = t_β. The following
theorem, although not essential to our later work, shows that in fact such
fixed points form a proper class.


Veblen Fixed-Point Theorem Schema (1907) Assume that we have
defined a normal operation assigning t_α to each ordinal number α. Then
the operation has arbitrarily large fixed points, i.e., for every ordinal
number β we can find an ordinal number γ with t_γ = γ and β ∈̲ γ.


Proof From the monotonicity we have β ∈̲ t_β. If β = t_β, we are done;
just take γ = β. So we may assume that β ∈ t_β. Then by monotonicity


β ∈ t_β ∈ t_{t_β} ∈ ···.


We claim that the supremum of these ordinals is the desired fixed point.
More formally, we define by recursion a function f from ω into the
ordinals such that

f(0) = β and f(n⁺) = t_{f(n)}.

Thus f(1) = t_β, f(2) = t_{t_β}, and so forth. We will write t_β^n for f(n). As
mentioned above, by monotonicity t_β^n ∈ t_β^{n+1}, so that

β ∈ t_β ∈ t_β^2 ∈ ···.


Define λ = sup ran f = sup{t_β^n | n ∈ ω}. We claim that t_λ = λ.
Clearly λ has no largest element, and hence is a limit ordinal. Let
S = ran f = {t_β^n | n ∈ ω}, so that λ = sup S. By Theorem 8E,

t_λ = sup{t_α | α ∈ S}
= sup{t_β^{n+1} | n ∈ ω}
= λ.

Thus we have a fixed point. ⊣
This proof actually gives us the least fixed point that is at least as
large as β (by Exercise 7). For example, consider the aleph operation
with β = 0. Then


0 ∈ ℵ_0 ∈ ℵ_{ℵ_0} ∈ ···


and the supremum of these numbers is the least fixed point.

Since the class of fixed points of the operation t is an unbounded class
of ordinals, we can define the derived operation t′ enumerating the fixed
points in ascending order. The definition of t′ is produced by the transfinite
recursion machinery; the crucial equation is


t′_α = the least fixed point of t different from t′_β for every β ∈ α.


It turns out (Exercise 8) that t′ is again a normal operation.


Exercises 


3. Assume that we have defined a monotone operation assigning t_α to each
ordinal α. Show that


β ∈ γ iff t_β ∈ t_γ


and


β = γ iff t_β = t_γ


for any ordinals β and γ.


4. Assume that we have defined a normal operation assigning t_α to each
α. Show that whenever λ is a limit ordinal, then t_λ is also a limit ordinal.


5. Assume that we have defined a monotone operation assigning t_α to each
α. Show that β ∈̲ t_β for every ordinal number β.


6. Assume that we have defined a normal operation assigning t_α to each
α. Show that the range-class


{t_α | α is an ordinal number}


is a closed unbounded class of ordinal numbers. 


7. Show that the proof of Veblen’s fixed-point theorem produces the least
fixed point that is at least as large as β.


8. Assume that we have defined a normal operation assigning t_α to each
α. Let t′ be the derived operation, enumerating the fixed points of t in
order. Show that t′ is again a normal operation.


ISOMORPHISM TYPES 


We have ordinal numbers to measure the length of well orderings. If 
you know the ordinal number of a well-ordered structure, then you know 
everything there is to know about that structure, “to within isomorphism.” 
This is an informal way of saying that two well-ordered structures receive 
the same ordinal iff they are isomorphic (Theorem 7I).

Our next undertaking will be to extend these ideas to handle structures 
that are not well ordered. Although our intent is to apply the extended 
ideas to linearly ordered structures, the initial definitions will be quite 
general. 

Consider then a structure <A, R>. The “shape” of this structure is to
be measured to within isomorphism by it<A, R>, the isomorphism type of
<A, R>. This is to be defined in such a way that two structures receive the
same isomorphism type iff they are isomorphic:


it<A, R> = it<B, S> iff <A, R> ≅ <B, S>.


As a first guess, we could argue as follows: Isomorphism is an equivalence
concept, so why not take it<A, R> to be the equivalence class of <A, R>?
Then as in Lemma 3N, two structures should have the same equivalence
class iff they are isomorphic.

In one way, this first guess is a very bad idea. The “equivalence
classes” here will fail to be sets. If A is nonempty, then no set contains
every structure that is isomorphic to <A, R>; this is Exercise 9.

With but one modification, this first guess will serve our needs perfectly.
Do not take all structures isomorphic to <A, R>, just those of least possible
rank α. Then these structures will be in V_{α⁺} and so it<A, R> will be a set.
We now will write this down officially.


Definition Let R be a binary relation on A. The isomorphism type
it<A, R> of the structure <A, R> is the set of all structures <B, S> such
that


(i) <A, R> ≅ <B, S>, and
(ii) no structure of rank less than rank <B, S> is isomorphic to <B, S>.


It will help to bring in (but only for very temporary use) some 
terminology. Call a structure pioneering iff there is no structure of smaller 
rank isomorphic to it. If two pioneering structures are isomorphic, then 
clearly they have the same rank. 

We claim that any structure <A, R> is isomorphic to some (not necessarily
unique) pioneering structure. The class


{α | α is the rank of a structure isomorphic to <A, R>}


is nonempty (rank <A, R> is in it). So it has a least element α_0, which
we can call the pioneer ordinal for <A, R>. Some structure of rank α_0 is
isomorphic to <A, R>, and it is pioneering by the leastness of α_0.

The definition of it<A, R> can now be phrased: It is the set of all
pioneering structures that are isomorphic to <A, R>. This set has been
seen to be nonempty.


Note that it<A, R> is indeed a set, being a subset of V_{α_0⁺}, where
α_0 is the pioneer ordinal for <A, R>.


Theorem 8F Structures <A, R> and <B, S> have the same isomorphism
type iff they are isomorphic:


it<A, R> = it<B, S> iff <A, R> ≅ <B, S>.


Proof First assume that it<A, R> = it<B, S>. Let <C, T> be any
structure in this common isomorphism type. Then


<A, R> ≅ <C, T> ≅ <B, S>.


For the converse, assume that <A, R> ≅ <B, S>. Then the same structures
are isomorphic to each of these two; in particular, the same pioneering
structures are isomorphic to each. Hence it<A, R> = it<B, S>. ⊣


Digression Concerning Cardinal Numbers The crucial property of 
cardinal numbers, 


card A = card B iff A ≈ B,


looks a lot like Theorem 8F, simplified by omission of the binary relations.
This indicates the possibility of an alternative definition of cardinal numbers.
Specifically, define kard A to be the set of all sets B equinumerous to
A and having the least possible rank. Then (as in Theorem 8F),


kard A = kard B iff A ≈ B.


In comparing card A with kard A, we see that the definition of card A 
uses the axiom of choice (but not regularity). The definition of kard A relies 
on regularity but does not require the axiom of choice. (This is a point 
in favor of kard.) For a finite nonempty set A, kard A fails to be a natural 
number. (This is a point in favor of card.) 


Exercises 


9. Assume that <A, R> is a structure with A ≠ ∅. Show that no set
contains every structure isomorphic to <A, R>.
10. (a) Show that kard n = {n} for n = 0, 1, and 2. 
(b) Calculate kard 3. 


ARITHMETIC OF ORDER TYPES 
We now focus attention solely on the order types. 


Definition An order type is the isomorphism type of some linearly 
ordered structure. 


We will use Greek letters ρ, σ, ... for order types. Any member of an
order type ρ is said to be a linearly ordered structure of type ρ.

First we want to define addition of order types. The basic idea is that
ρ + σ should be the order type “first ρ and then σ.” For the real definition
of ρ + σ, first select <A, R> of type ρ and <B, S> of type σ with
A ∩ B = ∅. (This is possible, by Exercise 11.) Then define the relation
R ⊕ S by


R ⊕ S = R ∪ S ∪ (A × B).

(We reject the cumbersome R _A⊕_B S notation.) Finally we define ρ + σ by

ρ + σ = it<A ∪ B, R ⊕ S>.


The idea is that we want to order A ∪ B in such a way that any member
of A is less than any member of B. But within A and B, we order 
according to R and S. 
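
For finite structures the construction of R ⊕ S can be checked mechanically. The following is only a minimal sketch under assumptions of our own: orderings are represented as Python sets of pairs, and the names ordered_sum and is_linear_order are hypothetical helpers, not notation from the text.

```python
from itertools import product

def ordered_sum(A, R, B, S):
    """Return (A ∪ B, R ⊕ S), where R ⊕ S = R ∪ S ∪ (A × B).

    A and B must be disjoint; R and S are sets of pairs (x, y) meaning x < y.
    """
    A, B = set(A), set(B)
    if A & B:
        raise ValueError("A and B must be disjoint")
    return A | B, set(R) | set(S) | set(product(A, B))

def is_linear_order(X, R):
    """Check that R is irreflexive, transitive, and connected on X."""
    irreflexive = all((x, x) not in R for x in X)
    transitive = all((x, z) in R
                     for (x, y) in R for (y2, z) in R if y == y2)
    connected = all(x == y or (x, y) in R or (y, x) in R
                    for x in X for y in X)
    return irreflexive and transitive and connected

# A = {a0, a1} with a0 < a1, and B = {b0}.
A, R = {"a0", "a1"}, {("a0", "a1")}
B, S = {"b0"}, set()
U, RS = ordered_sum(A, R, B, S)
print(is_linear_order(U, RS))
```

Every member of A lands below every member of B, while R and S are kept unchanged inside A and B.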

The next lemma verifies that ρ + σ is a well-defined order type.


Lemma 8G Assume that <A, R> and <B, S> are linearly ordered
structures with A and B disjoint.

(a) R ⊕ S is a linear ordering on A ∪ B.
(b) If <A, R> ≅ <A′, R′>, <B, S> ≅ <B′, S′>, and A′ ∩ B′ = ∅, then


<A ∪ B, R ⊕ S> ≅ <A′ ∪ B′, R′ ⊕ S′>.


Proof (a) First of all, R ⊕ S is irreflexive because a pair of the form
<x, x> cannot belong to R or S, nor can it belong to A × B since A and B
are disjoint.

To check that R ⊕ S is transitive, suppose that <x, y> and <y, z>
belong to R ⊕ S. If both of these pairs come from R (or from S), then of
course xRz (or xSz). Otherwise one of the pairs comes from A × B. We
may suppose that <x, y> ∈ A × B; the other case is similar. Then x ∈ A and
the pair <y, z> comes from S (because y ∈ B). Hence z ∈ B and <x, z> ∈ A × B.

Finally, it is easy to see that R ⊕ S is connected on A ∪ B. Consider
any x and y in A ∪ B. There are three cases: both in A, both in B, or
mixed. But all three are trivial.

Part (b) is sufficiently straightforward so that nothing more need be
said. ⊣


The role of this lemma is to assure us that when we define

ρ + σ = it<A ∪ B, R ⊕ S>,


the end result will be independent of just which structures of type ρ and σ
are utilized.


Example For each ordinal α, we have the order type it<α, ∈_α>. Distinct
ordinals yield distinct order types, since


it<α, ∈_α> = it<β, ∈_β> ⇒ <α, ∈_α> ≅ <β, ∈_β>
⇒ α = β


for ordinals, by Theorems 7I and 7L. The order type it<α, ∈_α> will be
denoted as ᾱ. In particular, we have the order types 1̄, 3̄, and ω̄ (which are
it<1, ∈_1> and so forth). And 1̄ + 3̄ = 4̄. (Please explain why.) Also
1̄ + ω̄ = ω̄, whereas ω̄ + 1̄ is the order type of ω⁺.


Example It is traditional to use η and λ to denote the order type of the
rationals and reals, respectively (in their usual ordering):


η = it<Q, <_Q> and λ = it<R, <_R>.


Then 1̄ + η is an order type with a least element but no greatest element.
(Or more pedantically, the ordered structures of type 1̄ + η have least
elements but not greatest elements.) But η + 1̄ has a greatest element but
no least element; hence 1̄ + η ≠ η + 1̄. Also λ + λ ≠ λ, because in type
λ + λ there is a bounded nonempty set without a least upper bound. On
the other hand η + η = η. This is not obvious, but see Exercise 19.


Any order type ρ can be run backwards to yield an order type ρ*.
More specifically, we can select a linearly ordered structure <A, R> of type
ρ. Then we define

ρ* = it<A, R⁻¹>.


It is routine to check that ρ* is a well-defined order type (compare
Exercise 43 of Chapter 3). For any finite ordinal α, we have ᾱ* = ᾱ by
Exercise 19 of Chapter 7. But ω̄* ≠ ω̄; in fact ω̄* is the order type of the
negative integers, which are not well ordered. It is easy to see that η* = η
and λ* = λ.


Example The sum ω̄* + ω̄ is the order type of the set Z of integers.
But ω̄ + ω̄* is different; it has both a least element and a greatest element,
and infinitely many points between the two.


Now consider the multiplication of order types. The product ρ · σ can
be described informally as “ρ, σ times.” More formally, we select structures
<A, R> and <B, S> of types ρ and σ, respectively. Then define R * S to be
“Hebrew lexicographic order” on A × B:


<a_1, b_1>(R * S)<a_2, b_2> iff either b_1 S b_2 or (b_1 = b_2 and a_1 R a_2).


This orders the pairs in A × B according to their second coordinates,
and then by their first coordinates when the second coordinates coincide.
We can now define the product


ρ · σ = it<A × B, R * S>.
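
The defining condition for R * S translates directly into a comparison routine. The sketch below is illustrative only: the function names are hypothetical, and both factors are assumed to carry the usual order on integers.

```python
def hebrew_lex_less(p, q, R_less, S_less):
    """<a1, b1> (R * S) <a2, b2> iff b1 S b2, or b1 = b2 and a1 R a2."""
    (a1, b1), (a2, b2) = p, q
    return S_less(b1, b2) or (b1 == b2 and R_less(a1, a2))

def int_less(x, y):
    """The usual strict order on integers, used for both factors here."""
    return x < y

# The set 3 x 2, listed in Hebrew lexicographic order:
# compare second coordinates first, then first coordinates.
pairs = [(a, b) for a in range(3) for b in range(2)]
ordered = sorted(pairs, key=lambda p: (p[1], p[0]))
print(ordered)
```

Sorting on the key (second coordinate, first coordinate) agrees with R * S in this case because both factors are ordered by the usual <.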


The next lemma verifies that ρ · σ is a well-defined order type. This
lemma does for multiplication what Lemma 8G does for addition.


Lemma 8H Assume that <A, R> and <B, S> are linearly ordered
structures.


(a) R * S is a linear ordering on A × B.
(b) If <A, R> ≅ <A′, R′> and <B, S> ≅ <B′, S′>, then


<A × B, R * S> ≅ <A′ × B′, R′ * S′>.


Proof (a) It is easy to see from the definition of R * S that it is
irreflexive (because both R and S are) and is connected on A × B. It
remains to verify that it is a transitive relation. So assume that


<a_1, b_1>(R * S)<a_2, b_2> and <a_2, b_2>(R * S)<a_3, b_3>.


This assumption breaks down into four cases, as illustrated in Fig. 47. But
in each case, we have <a_1, b_1>(R * S)<a_3, b_3>.

Part (b) is again sufficiently straightforward that nothing more need be
said. ⊣


[Fig. 47. Transitivity of R * S. The hypothesis <a_1, b_1>(R * S)<a_2, b_2> splits into the cases b_1 S b_2 and (b_1 = b_2 & a_1 R a_2); the hypothesis <a_2, b_2>(R * S)<a_3, b_3> splits into the cases b_2 S b_3 and (b_2 = b_3 & a_2 R a_3).]


Examples The product ω̄ · 2̄ is “ω̄, 2 times.” The set ω × 2 under
Hebrew lexicographic ordering is:


<0, 0>, <1, 0>, <2, 0>, ..., <0, 1>, <1, 1>, <2, 1>, ....


On the other hand 2̄ · ω̄ is “2̄, ω times.” The set 2 × ω under Hebrew
lexicographic order is:


<0, 0>, <1, 0>, <0, 1>, <1, 1>, <0, 2>, <1, 2>, ....


This ordering, unlike the other, is isomorphic to the natural numbers,
i.e., 2̄ · ω̄ = ω̄. In particular, 2̄ · ω̄ ≠ ω̄ · 2̄.
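
The difference between the two orderings can be probed on finite initial segments. The sketch below rests on an assumption of ours: the map <a, b> ↦ 2b + a (called position here, a hypothetical name) is offered as an isomorphism from 2 × ω onto ω, and the code can only test it on a truncation of size N.

```python
def hebrew_less(p, q):
    """Hebrew lexicographic order on pairs of natural numbers."""
    return p[1] < q[1] or (p[1] == q[1] and p[0] < q[0])

def position(a, b):
    """Candidate isomorphism from 2 x omega onto omega: <a, b> -> 2*b + a."""
    return 2 * b + a

N = 6  # a machine can only inspect a finite initial segment

# 2 x omega, truncated: the map preserves order and hits 0, 1, ..., 2N - 1.
two_by_omega = [(a, b) for b in range(N) for a in range(2)]
order_preserved = all(
    hebrew_less(p, q) == (position(*p) < position(*q))
    for p in two_by_omega for q in two_by_omega
)
onto_segment = sorted(position(*p) for p in two_by_omega) == list(range(2 * N))

# omega x 2, truncated: <0, 1> already has N predecessors <n, 0>,
# and that count grows without bound as N grows.
omega_by_2 = [(a, b) for b in range(2) for a in range(N)]
predecessors_of_0_1 = sum(hebrew_less(p, (0, 1)) for p in omega_by_2)

print(order_preserved, onto_segment, predecessors_of_0_1)
```

In ω × 2 the element <0, 1> has infinitely many predecessors, which no natural number has; this is the obstruction to an isomorphism with ω.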


The next theorem gives some of the general laws that addition and
multiplication obey. We have already seen that neither operation obeys the
commutative law in general; for example,


1̄ + ω̄ ≠ ω̄ + 1̄ and 2̄ · ω̄ ≠ ω̄ · 2̄.

Furthermore the right distributive law fails; for example,

(ω̄ + 1̄) · 2̄ ≠ ω̄ · 2̄ + 1̄ · 2̄

(Exercise 15).

Theorem 8I The following identities hold for any order types.

(a) Associative laws

(ρ + σ) + τ = ρ + (σ + τ),
(ρ · σ) · τ = ρ · (σ · τ).

(b) Left distributive law


ρ · (σ + τ) = (ρ · σ) + (ρ · τ).

(c) Identity elements


ρ + 0̄ = 0̄ + ρ = ρ,
ρ · 1̄ = 1̄ · ρ = ρ,
ρ · 0̄ = 0̄ · ρ = 0̄.


Proof Both (ρ + σ) + τ and ρ + (σ + τ) produce the “first ρ then σ
and then τ” ordering. Let <A, R>, <B, S>, and <C, T> be of type ρ, σ,
and τ, respectively, with A, B, and C disjoint. Then both (R ⊕ S) ⊕ T
and R ⊕ (S ⊕ T) are easily shown to be


R ∪ S ∪ T ∪ (A × B) ∪ (A × C) ∪ (B × C).


Both (ρ · σ) · τ and ρ · (σ · τ) produce Hebrew lexicographic order on
A × B × C. (R * S) * T is an ordering on (A × B) × C, and R * (S * T) is
an ordering on A × (B × C), but we can establish an isomorphism. In
detail, the condition for <a_1, b_1, c_1> to be less than <a_2, b_2, c_2> under
(R * S) * T is, when expanded,


c_1 T c_2 or (c_1 = c_2 & b_1 S b_2) or (c_1 = c_2 & b_1 = b_2 & a_1 R a_2).


The condition for <a_1, <b_1, c_1>> to be less than <a_2, <b_2, c_2>> under
R * (S * T) is, when expanded, exactly the same.

In the left distributive law, both R * (S ⊕ T) and (R * S) ⊕ (R * T) are
orderings on the set (A × B) ∪ (A × C). And in both cases the condition
for <a_1, x> to be less than <a_2, y> turns out to be


x S y or x T y or (x ∈ B & y ∈ C) or (x = y & a_1 R a_2).


Part (c) is straightforward. ⊣
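
The argument for the left distributive law can be tested on small finite structures, where both sides are literally the same set of pairs. This is a sketch under our own conventions: orderings are Python sets of pairs, and osum, oprod, and chain are hypothetical helper names. A finite check of course illustrates the identity rather than proving it.

```python
from itertools import product

def osum(A, R, B, S):
    """(A ∪ B, R ⊕ S) for disjoint A and B."""
    return A | B, R | S | set(product(A, B))

def oprod(A, R, B, S):
    """(A × B, R * S), with R * S the Hebrew lexicographic order."""
    X = set(product(A, B))
    rel = {(p, q) for p in X for q in X
           if (p[1], q[1]) in S or (p[1] == q[1] and (p[0], q[0]) in R)}
    return X, rel

def chain(elems):
    """Strict linear order in which elems appear in increasing order."""
    elems = list(elems)
    rel = {(x, y) for i, x in enumerate(elems) for y in elems[i + 1:]}
    return set(elems), rel

A, R = chain(["a0", "a1"])
B, S = chain(["b0", "b1", "b2"])
C, T = chain(["c0"])

lhs = oprod(A, R, *osum(B, S, C, T))                # R * (S ⊕ T)
rhs = osum(*oprod(A, R, B, S), *oprod(A, R, C, T))  # (R * S) ⊕ (R * T)
print(lhs == rhs)
```

No finite experiment of this kind can exhibit the failure of the right distributive law, since that failure requires an infinite type such as ω̄.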


Exercises 

11. Show that for any order types ρ and σ there exist structures <A, R>
and <B, S> of types ρ and σ, respectively, such that A ∩ B = ∅.

12. Prove that for any linearly ordered structures,


it<A, R> + it<B, S> = it<({0} × A) ∪ ({1} × B), <_L>,


where <_L is lexicographic ordering.

13. Supply proofs for part (b) of Lemma 8G and part (b) of Lemma 8H.

14. Assume that ρ · σ = 0̄. Show that either ρ = 0̄ or σ = 0̄.

15. Show that (ω̄ + 1̄) · 2̄ is not the same as (ω̄ · 2̄) + (1̄ · 2̄).

16. Supply a proof for part (c) of Theorem 8I.

17. A partial ordering R is said to be dense iff whenever xRz, then
xRy and yRz for some y. For example, the usual ordering <_Q on the
set Q of rational numbers is dense. Assume that <A, R> is a linearly
ordered structure with A countable and R dense. Show that <A, R> is
isomorphic to <B, <_Q> for some subset B of Q. [Suggestion: Suppose
that A = {a_0, a_1, ...}. Define f(a_i) by recursion on i.]


18. Assume that <A, R> and <B, S> are both linearly ordered structures
with dense orderings. Assume that A and B are countable and nonempty.
Assume that neither ordering has a first or last element. Show that
<A, R> ≅ <B, S>. [Suppose that A = {a_0, a_1, ...} and B = {b_0, b_1, ...}. At
stage 2n be sure that a_n is paired with some suitable b_j, and at stage
2n + 1 be sure that b_n is paired with some suitable a_i.]


19. Use the preceding exercise to show that η + η = η · η = η.


ORDINAL ARITHMETIC 


There are two available methods for defining addition and multiplication 
on the ordinals. One method uses transfinite recursion on the ordinals, 
extending the definitions by recursion used for the finite ordinals in Chapter 4. 
The second method uses our more recent work with order types, and defines 
α + β to be the ordinal γ such that ᾱ + β̄ = γ̄ (under addition of order
types). 

Since both methods have their advantages, we will show that they 
produce exactly the same operations. We will then be able to draw on either 
method as the occasion demands. 

We will say that an order type p is well ordered iff the structures of type 
p are well ordered. 


Theorem 8J The sum and the product of well-ordered types is again 
well ordered. 


Proof Assume that <A, R> and <B, S> are well-ordered structures with
A and B disjoint. What is to be proved is that <A ∪ B, R ⊕ S> and
<A × B, R * S> are well-ordered structures. If C is a nonempty subset of
A ∪ B, then either C ∩ A ≠ ∅ (in which case the R-least element of C ∩ A is
(R ⊕ S)-least in C) or else C ⊆ B (in which case its S-least element is
(R ⊕ S)-least). Similarly if D is a nonempty subset of A × B, we first take the
S-least b_0 in ran D. Let a_0 be the R-least member of {a | <a, b_0> ∈ D}. Then
<a_0, b_0> is the (R * S)-least element of D. ⊣
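
The two-step search for the (R * S)-least element can be written out directly. In this sketch the well orderings are assumed to be given by key functions into the natural numbers; that representation, and the name hebrew_least, are our own illustrative choices.

```python
def hebrew_least(D, R_key, S_key):
    """Least element of a nonempty finite D ⊆ A × B under R * S.

    Follows the proof: first the S-least b0 in ran D, then the R-least a0
    among those a with <a, b0> in D.
    """
    b0 = min((b for (_, b) in D), key=S_key)
    a0 = min((a for (a, b) in D if b == b0), key=R_key)
    return (a0, b0)

def identity(n):
    """Key for the usual order on the natural numbers."""
    return n

D = {(5, 2), (0, 3), (7, 2), (9, 4)}
least = hebrew_least(D, identity, identity)

# Cross-check against a direct minimum under (second, first) comparison.
brute = min(D, key=lambda p: (p[1], p[0]))
print(least, brute)
```

Note that <0, 3> is not the least element even though 0 is the smallest first coordinate; second coordinates are compared first.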


The above theorem lets us transfer addition and multiplication from
well-ordered types directly to the ordinals. For ordinals α and β, we have the
well-ordered types ᾱ + β̄ and ᾱ · β̄. The ordinals of these types are defined to be
α + β and α · β.


Definition Let α and β be ordinal numbers. Define the sum α + β to be
the unique ordinal γ such that ᾱ + β̄ = γ̄. Define the product α · β to be the
unique ordinal δ such that ᾱ · β̄ = δ̄.


To be more specific, observe that ᾱ + β̄ is the order type of

({0} × α) ∪ ({1} × β)


in lexicographic order <_L (compare Exercise 12). Consequently α + β is the
ordinal number of the structure


<({0} × α) ∪ ({1} × β), <_L>.

This is the ordinal number that measures the “first α and then β” ordering.
Similarly ᾱ · β̄ is the order type of α × β with Hebrew lexicographic order
<_H. Consequently α · β is the ordinal number of the structure <α × β, <_H>.
This is the ordinal number that measures the “α, β times” ordering.
We use the same symbols + and · for both order type arithmetic and
ordinal arithmetic, but we will try always to be clear about which is
intended.


Example To calculate the sum 1 + ω, we shift to the order types 1̄ + ω̄.
From a previous example 1̄ + ω̄ = ω̄, so going back to ordinals we have
1 + ω = ω. Similarly, since ω̄ + 1̄ is the order type of ω⁺ and 2̄ · ω̄ = ω̄,
we obtain the corresponding equations for ordinals: ω + 1 = ω⁺ and
2 · ω = ω.
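
For experimentation, ordinals below ω · ω can be handled by machine. The sketch below assumes a representation that is ours and not the text's: the pair (a, b) of natural numbers stands for ω · a + b. The addition rule uses the fact that a finite summand is absorbed by a following ω (as in 1 + ω = ω), together with the associative and left distributive laws for order types.

```python
def add_below_omega_squared(x, y):
    """Add the ordinals ω·a + b and ω·c + d, given as pairs (a, b) and (c, d).

    If c > 0, the finite tail b is absorbed, because b + ω = ω.
    If c = 0, the two finite tails simply add.
    """
    (a, b), (c, d) = x, y
    return (a + c, d) if c > 0 else (a, b + d)

ONE, OMEGA = (0, 1), (1, 0)

print(add_below_omega_squared(ONE, OMEGA))   # the pair for 1 + ω
print(add_below_omega_squared(OMEGA, ONE))   # the pair for ω + 1
```

The first result is the pair for ω itself and the second is the pair for ω⁺, so the two sums differ.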


Example We claim that α + 1 = α⁺ for any ordinal α. The sum ᾱ + 1̄ is
the order type of α ∪ {s} (where s ∉ α) under the ordering that makes s the
largest element. This is the same as the order type of α ∪ {α} under the
epsilon ordering, which is just α⁺. So ᾱ + 1̄ is the order type of α⁺, whereupon
α + 1 = α⁺.


Note that the definition of ordinal addition and multiplication yields
the equations


order type of α + β = ᾱ + β̄ and order type of α · β = ᾱ · β̄.


To verify an equation, e.g., ω · 2 = ω + ω, we can use the following strategy.
It suffices to show that the order types of ω · 2 and ω + ω are the same, since
the assignment of order types to ordinals is one-to-one. By the above
equations, this reduces to verifying that ω̄ · 2̄ = ω̄ + ω̄. And this can be done
by selecting representative structures for each side of the equation and
showing them to be isomorphic. In recent notation, the fact that


<ω × 2, <_H> ≅ <({0} × ω) ∪ ({1} × ω), <_L>

does the job. (Alternatively, we can appeal to the next theorem to obtain
ω · 2 = ω + ω.)

The laws previously established for all order types must in particular be 
valid for ordinals: 

Theorem 8K For any ordinal numbers:


(α + β) + γ = α + (β + γ),
(α · β) · γ = α · (β · γ),
α · (β + γ) = (α · β) + (α · γ),
α + 0 = 0 + α = α,
α · 1 = 1 · α = α,
α · 0 = 0 · α = 0.

Proof We know from Theorem 8I that addition of order types is
associative, so


(ᾱ + β̄) + γ̄ = ᾱ + (β̄ + γ̄).

Hence the ordinals (α + β) + γ and α + (β + γ) have the same order type,
and thus are equal. The same argument is applicable to the other parts
of the theorem. ⊣


Our work with order types also provides us with counterexamples to the
commutative laws and the right distributive laws:

1 + ω ≠ ω + 1,
2 · ω ≠ ω · 2,
(ω + 1) · 2 ≠ (ω · 2) + (1 · 2).

The next theorem gives the equations that characterize addition and
multiplication by recursion. Recall that for a set S of ordinals, its supremum
sup S is just ⋃S.


Theorem 8L For any ordinal numbers α and β and any limit ordinal λ
the following equations hold.


(A1) α + 0 = α,
(A2) α + β⁺ = (α + β)⁺,
(A3) α + λ = sup{α + β | β ∈ λ},
(M1) α · 0 = 0,
(M2) α · β⁺ = α · β + α,
(M3) α · λ = sup{α · β | β ∈ λ}.


Proof (A1) and (M1) are contained in the previous theorem. To prove
(A2), recall that β⁺ = β + 1. Hence


α + β⁺ = α + (β + 1)
= (α + β) + 1
= (α + β)⁺.

Similarly for (M2) we have

α · β⁺ = α · (β + 1)
= (α · β) + (α · 1)
= (α · β) + α.


It remains to prove (A3) and (M3). For that proof we need a lemma on
chains of well orderings. Say that a structure <B, <_B> is an end extension
of a structure <A, <_A> iff A ⊆ B, <_A ⊆ <_B, and every element of B − A
is larger than anything in A:


a ∈ A & b ∈ B − A ⇒ a <_B b.
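
The three conditions in the definition of an end extension can be tested directly on finite structures. This is a sketch under our own representation (sets of elements and sets of ordered pairs); the function name is hypothetical.

```python
def is_end_extension(A, lt_A, B, lt_B):
    """True iff <B, lt_B> is an end extension of <A, lt_A>.

    Checks A ⊆ B, lt_A ⊆ lt_B, and that every element of B − A
    lies above every element of A in lt_B.
    """
    return (A <= B
            and lt_A <= lt_B
            and all((a, b) in lt_B for a in A for b in B - A))

A, lt_A = {0, 1}, {(0, 1)}

# Adding a new element on top gives an end extension.
B, lt_B = {0, 1, 2}, {(0, 1), (0, 2), (1, 2)}

# Adding a new element at the bottom does not.
C, lt_C = {-1, 0, 1}, {(-1, 0), (-1, 1), (0, 1)}

print(is_end_extension(A, lt_A, B, lt_B), is_end_extension(A, lt_A, C, lt_C))
```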


Lemma 8M Assume that 𝒞 is a set of well-ordered structures. Further
assume that if <A, <_A> and <B, <_B> are structures in 𝒞, then one is an
end extension of the other. Let <W, <_W> be the union of all these structures
in the sense that


W = ⋃{A | <A, <_A> ∈ 𝒞 for some <_A},
<_W = ⋃{<_A | <A, <_A> ∈ 𝒞 for some A}.


Then <W, <_W> is a well-ordered structure whose ordinal number is the
supremum of the ordinals of the members of 𝒞.


Proof of the Lemma For each structure <A, <_A> in 𝒞, we have the usual
isomorphism E_A onto its ordinal number, ordered by epsilon. If <B, <_B>
is an end extension of <A, <_A>, then E_B is an extension of E_A, i.e.,
E_A = E_B ↾ A. (This is easy to see, but formally we verify E_A(a) = E_B(a) by
induction on a in A.) Thus the set of all possible E_A's


{E_A | <A, <_A> ∈ 𝒞 for some <_A}


is a chain of one-to-one functions. So its union (call it E) is a one-to-one
function. The domain of E is the union of all possible A's; that is,
dom E = W. The range of E is the union (supremum) of the ordinals of the
structures in 𝒞; call this ordinal θ. Thus E maps W one-to-one onto θ.
Furthermore it preserves order:

x <_W y ⇔ x <_A y for some A in 𝒞
⇔ E_A(x) ∈ E_A(y) for some A in 𝒞
⇔ E(x) ∈ E(y).


Thus E is an isomorphism of <W, <_W> onto <θ, ∈_θ>. Hence <W, <_W> is
well ordered, and its ordinal is θ. ⊣


Now return to the proof of Theorem 8L. Since λ is a limit ordinal, we
have λ = ⋃λ. The sum α + λ is the ordinal number of the following set
(under lexicographic order):


({0} × α) ∪ ({1} × λ) = ({0} × α) ∪ ({1} × ⋃{β | β ∈ λ})
= ({0} × α) ∪ ⋃{{1} × β | β ∈ λ}
= ⋃{({0} × α) ∪ ({1} × β) | β ∈ λ}
= ⋃{A_β | β ∈ λ},


where A_β = ({0} × α) ∪ ({1} × β). The ordinal of A_β (under lexicographic
order) is α + β. The ordinal of ⋃{A_β | β ∈ λ} is provided by the lemma. Once
we verify that the lemma is applicable, it will tell us that the ordinal of
⋃{A_β | β ∈ λ} is sup{α + β | β ∈ λ}, which is what we need for (A3).

For any β and γ in λ, either β ∈̲ γ or γ ∈̲ β. If it is the former, then A_γ
(under lexicographic order) is clearly an end extension of A_β. Thus the lemma
is applicable.

The proof of (M3) is similar. The product α · λ is the ordinal number of
α × λ under <_H, Hebrew lexicographic order:


α × λ = α × ⋃{β | β ∈ λ}
= ⋃{α × β | β ∈ λ}


and the ordinal of α × β is α · β. Again the lemma is applicable, because
whenever β ∈ γ, then α × γ is an end extension of α × β (under <_H). The
lemma tells us that the ordinal of ⋃{α × β | β ∈ λ} is sup{α · β | β ∈ λ},
which completes the proof of (M3). ⊣


For finite ordinals (the natural numbers) we have now defined addition 
and multiplication three times: first in Chapter 4 (by recursion), then in 
Chapter 6 (as finite cardinals), and now again (by use of order types). All 
three agree on the natural numbers; Theorem 8L shows that the recent 
definition is in agreement with Chapter 4. 

Theorem 8L also indicates how addition and multiplication could have 
been (equivalently) defined by transfinite recursion on the ordinals. The 
six equations in the theorem describe how, for fixed α, to form α + β
from earlier values α + γ for γ less than β (and similarly for α · β).

We will give the details of this approach for exponentiation, but they are
easily modified to cover addition and multiplication. Consider then, a fixed
nonzero ordinal α. We propose to define α^β in such a way as to satisfy
the following three equations:


(E1) α^0 = 1,
(E2) α^{β⁺} = α^β · α,
(E3) α^λ = sup{α^β | β ∈ λ}


for a limit ordinal λ.
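
On the finite ordinals only the zero and successor clauses ever apply, so (A1), (A2), (M1), (M2), (E1), and (E2) already determine the three operations there. The following sketch makes that concrete; it is an illustration on Python integers, with function names of our own choosing, and it does not touch the limit clauses (A3), (M3), and (E3).

```python
def add(a, b):
    """a + b on natural numbers by (A1) a + 0 = a and (A2) a + b⁺ = (a + b)⁺."""
    return a if b == 0 else add(a, b - 1) + 1

def mul(a, b):
    """a · b by (M1) a · 0 = 0 and (M2) a · b⁺ = a · b + a."""
    return 0 if b == 0 else add(mul(a, b - 1), a)

def power(a, b):
    """a^b by (E1) a^0 = 1 and (E2) a^(b⁺) = a^b · a."""
    return 1 if b == 0 else mul(power(a, b - 1), a)

print(add(2, 3), mul(2, 3), power(2, 5))
```

Each function recurses on its second argument only, mirroring the way the equations hold the first argument α fixed.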
Now we activate the transfinite recursion machinery. The procedure is
similar to that used for the beth operation. Let γ(f, y) be the formula:


Either (i) f is a function with domain 0 and y = 1,

or (ii) f is a function whose domain is a successor ordinal β⁺
and y = f(β) · α,

or (iii) f is a function whose domain is a limit ordinal λ and
y = ⋃ ran f,

or (iv) none of the above and y = ∅.


Then transfinite recursion replies with a formula φ that will define the
exponentiation operation. We select the symbol “_αE” and define


_αE_β = the unique y such that φ(β, y)

for every ordinal β. Then we are assured that “γ(_αE ↾ β, _αE_β).” For β = 0,
this fact becomes

_αE_0 = 1.

For a successor ordinal β⁺ in place of β it becomes

_αE_{β⁺} = _αE_β · α

and for a limit ordinal λ in place of β it becomes

_αE_λ = sup{_αE_γ | γ ∈ λ}.


These three displayed equations are (E1)-(E3), except for notational
differences. We henceforth dispense with any special symbol for
exponentiation, and instead use the traditional placement of letters:

α^β = _αE_β.

There is a special problem in defining 0^β. If we were to follow blindly
(E1)-(E3), we would have 0^ω = 1. This is undesirable, which is our reason
for having specified in the foregoing that α is a fixed nonzero ordinal. We
can simply define 0^β directly: 0^0 = 1 and 0^β = 0 for β ≠ 0.




Example We have 2^ω = sup{2^n | n ∈ ω} = ω. Thus ordinal exponentiation
is very different from cardinal exponentiation, since 2^{ℵ_0} ≠ ℵ_0 for cardinals.
Ordinal addition and multiplication are also very different from cardinal 
addition and multiplication. Please do not confuse them. Theorem 6J tells 
us that the operations agree on finite numbers; that theorem does not 
extend to infinite numbers. 


We can now apply our remarks on ordinal operations to derive information
concerning ordinal arithmetic. Consider a fixed ordinal number α. Then
the operation of α-addition is the operation assigning to each ordinal β the
sum α + β. (In our earlier notation, t_β = α + β, where α is fixed.) Similarly,
the operation of α-multiplication assigns to β the product α · β, and
α-exponentiation assigns to β the power α^β.


Theorem 8N (a) For any ordinal number α, the operation of
α-addition is normal.

(b) If 1 ∈̲ α, then the operation of α-multiplication is normal.

(c) If 2 ∈̲ α, then the operation of α-exponentiation is normal.


Proof Continuity is immediate from (A3) and (M3) of Theorem 8L and
from (E3). For monotonicity, we use Theorem 8C, which tells us that it
suffices to show that:


α + β ∈ α + β⁺,
α · β ∈ α · β⁺ if 1 ∈̲ α,
α^β ∈ α^{β⁺} if 2 ∈̲ α.


The first of these is immediate from (A2) of Theorem 8L:

α + β ∈ (α + β)⁺ = α + β⁺,


whence α-addition is normal.
For multiplication we have


1 ∈̲ α ⇒ 0 ∈ α

and so by the monotonicity of (α · β)-addition,

α · β = α · β + 0 ∈ α · β + α.


This together with (M2) gives α · β ∈ α · β⁺.
Exponentiation is similar;


2 ∈̲ α ⇒ 1 ∈ α,

whence by the monotonicity of α^β-multiplication and (E2)

α^β = α^β · 1 ∈ α^β · α = α^{β⁺}.

This completes the proof. ⊣


Example Assume that λ is a limit ordinal. By Exercise 4, t_λ (for a
normal operation) is also a limit ordinal. So by the preceding theorem,
α + λ is a limit ordinal for any ordinal α. If 1 ∈̲ α, then both α · λ and
λ · α are limit ordinals. (In the second case we use the fact that
λ · β⁺ = λ · β + λ.) Similarly if 2 ∈̲ α, then both α^λ and λ^α are limit ordinals.


Corollary 8P (a) The following order-preserving laws hold.

β ∈ γ ⇔ α + β ∈ α + γ.

If 1 ∈̲ α, then

β ∈ γ ⇔ α · β ∈ α · γ.

If 2 ∈̲ α, then

β ∈ γ ⇔ α^β ∈ α^γ.

(b) The following left cancellation laws hold:

α + β = α + γ ⇒ β = γ.

If 1 ∈̲ α, then

α · β = α · γ ⇒ β = γ.

If 2 ∈̲ α, then

α^β = α^γ ⇒ β = γ.

Proof These are consequences of monotonicity. Any monotone operation
assigning t_β to β has the properties

β ∈ γ ⇔ t_β ∈ t_γ,
t_β = t_γ ⇒ β = γ,

by Exercise 3. ⊣


Part (b) of this theorem gives only left cancellation laws. Right
cancellation laws fail in general. For example,

    2 + ω = 3 + ω = ω,
    2·ω = 3·ω = ω,
    2^ω = 3^ω = ω,

but we cannot cancel the ω's to get 2 = 3. There is a weakened version
of part (a) that holds:


Theorem 8Q  The following weak order-preserving laws hold for any
ordinal numbers:

(a) β ∈̲ γ  ⟹  β + α ∈̲ γ + α.
(b) β ∈̲ γ  ⟹  β·α ∈̲ γ·α.
(c) β ∈̲ γ  ⟹  β^α ∈̲ γ^α.
Proof  Each part can be proved by transfinite induction on α. But parts
(a) and (b) also can be proved using concepts from order types. Assume
that β ∈̲ γ; then also β ⊆ γ. Thus

    ({0} × β) ∪ ({1} × α) ⊆ ({0} × γ) ∪ ({1} × α)

and β × α ⊆ γ × α. Furthermore in each case the relevant ordering
(lexicographic and Hebrew lexicographic, respectively) on the subset is the
restriction of the ordering on the larger set. Hence when we take the ordinal
numbers of the sets, we get β + α ∈̲ γ + α and β·α ∈̲ γ·α (compare
Exercise 17 of Chapter 7).

We will prove (c) by transfinite induction on α. So suppose that β ∈̲ γ
and that β^δ ∈̲ γ^δ whenever δ ∈ α; we must show that β^α ∈̲ γ^α. If α = 0, then
β^α = γ^α = 1. Next take the case where α is a successor ordinal δ⁺. Then

    β^α = β^δ·β
        ∈̲ γ^δ·β      since β^δ ∈̲ γ^δ
        ∈̲ γ^δ·γ      by Corollary 8P
        = γ^α.

Finally take the case where α is a limit ordinal. We may suppose that
neither β nor γ is 0, since those cases are clear. Since β^δ ∈̲ γ^δ for δ in α,
we have

    β^α = sup{β^δ | δ ∈ α} ∈̲ sup{γ^δ | δ ∈ α} = γ^α,

whence β^α ∈̲ γ^α. ⊣

It is not possible to replace "∈̲" by "∈" in the above theorem, since

    2 + ω ∉ 3 + ω,
    2·ω ∉ 3·ω,
    2^ω ∉ 3^ω,

despite the fact that 2 ∈ 3.


Subtraction Theorem  If α ∈̲ β (for ordinal numbers α and β), then there
exists a unique ordinal number δ (their "difference") such that α + δ = β.

Proof  The uniqueness of δ is immediate from the left cancellation law
(Corollary 8P). We will indicate two proofs of existence.

Since α-addition is a normal operation and β is at least as large as α + 0,
we know from Theorem 8D that there is a largest δ for which α + δ ∈̲ β.
But equality must hold, for if α + δ ∈ β, then

    α + δ⁺ = (α + δ)⁺ ∈̲ β

contradicting the maximality of δ.

The other proof starts from the fact that β − α (i.e., {x ∈ β | x ∉ α}) is
well ordered by epsilon. Let δ be its ordinal. Then α + δ is the ordinal of
({0} × α) ∪ ({1} × (β − α)) under lexicographic order, which is isomorphic
to β under the epsilon ordering. ⊣

For example, if α = 3 and β = ω, then the "difference" is ω, since
3 + ω = ω. If α ∈ β, we cannot in general find a δ for which δ + α = β; for
example, there is no δ for which δ + 3 = ω. (Why?)


Division Theorem  Let α and δ be ordinal numbers with δ (the divisor)
nonzero. Then there is a unique pair of ordinal numbers β and γ (the
quotient and the remainder) such that

    α = δ·β + γ   and   γ ∈ δ.

Proof  First we will prove existence. Since δ-multiplication is a normal
operation, there is a largest β for which δ·β ∈̲ α. Then by the subtraction
theorem there is some γ for which δ·β + γ = α. We must show that γ ∈ δ.
If to the contrary δ ∈̲ γ, then

    δ·β⁺ = δ·β + δ ∈̲ δ·β + γ = α,

contradicting the maximality of β.

Having established existence, we now turn to uniqueness. Assume that

    α = δ·β₁ + γ₁ = δ·β₂ + γ₂,

where γ₁ and γ₂ are less than δ. To show that β₁ = β₂, it suffices to
eliminate the alternatives. Suppose, contrary to our hopes, that β₁ ∈ β₂.
Then β₁⁺ ∈̲ β₂ and so δ·(β₁⁺) ∈̲ δ·β₂. Since γ₁ ∈ δ, we have

    α = δ·β₁ + γ₁ ∈ δ·β₁ + δ = δ·(β₁⁺) ∈̲ δ·β₂ ∈̲ δ·β₂ + γ₂ = α.

But this is impossible, so β₁ ∉ β₂. By symmetry, β₂ ∉ β₁. Hence β₁ = β₂;
call it simply β. We now have

    α = δ·β + γ₁ = δ·β + γ₂,

whence by left cancellation (Corollary 8P), γ₁ = γ₂. ⊣


Digression  Ordinal addition and multiplication are often introduced by
transfinite recursion instead of by order types. In that case, the foregoing
theorems can be used to establish the connection with order types, without
need of Theorem 8L or Lemma 8M. The statements to be proved are
those that we took in our development as definitions: The order type of an
ordinal sum (or product) is the sum (or product) of the order types. That is,
the equations

    (α + β)‾ = ᾱ + β̄   and   (α·β)‾ = ᾱ·β̄

(the bar denoting the order type of an ordinal under its epsilon ordering)
are, in this alternative development, to be proved. For addition, it suffices
to give an isomorphism from ({0} × α) ∪ ({1} × β) with lexicographic
ordering onto α + β with epsilon ordering. The isomorphism f is defined by
the equations

    f(<0, γ>) = γ        for γ in α,
    f(<1, δ>) = α + δ    for δ in β.

Then it can be verified that ran f = α + β and that f preserves order. For
multiplication, the isomorphism g from α × β with Hebrew lexicographic
order onto α·β with epsilon ordering is defined by the equation

    g(<γ, δ>) = α·δ + γ

for γ in α and δ in β. Again it can be verified that ran g = α·β and that g
preserves order.


Logarithm Theorem  Assume that α and β are ordinal numbers with
α ≠ 0 and β (the base) greater than 1. Then there are unique ordinal
numbers γ, δ, and ρ (the logarithm, the coefficient, and the remainder) such
that

    α = β^γ·δ + ρ   &   0 ≠ δ ∈ β   &   ρ ∈ β^γ.

Proof  Since β-exponentiation is a normal operation, there is (by
Theorem 8D) a largest γ such that β^γ ∈̲ α. Apply the division theorem to
α ÷ β^γ to obtain δ and ρ such that

    α = β^γ·δ + ρ   and   ρ ∈ β^γ.

Note that δ ≠ 0 since ρ ∈ β^γ ∈̲ α. We must show that δ ∈ β. If to the contrary
β ∈̲ δ, then β^{γ⁺} = β^γ·β ∈̲ β^γ·δ ∈̲ β^γ·δ + ρ = α, contradicting the maximality
of γ. Thus we have the existence of γ, δ, and ρ meeting the prescribed
conditions.

To show uniqueness, consider any representation α = β^γ·δ + ρ for
which 0 ≠ δ ∈ β and ρ ∈ β^γ. We first claim that γ must be exactly the one
we used in the preceding paragraph. We have

    β^γ ∈̲ α = β^γ·δ + ρ     since 1 ∈̲ δ
            ∈ β^γ·δ + β^γ   since ρ ∈ β^γ
            = β^γ·δ⁺
            ∈̲ β^γ·β         since δ ∈ β
            = β^{γ⁺}.

Thus β^γ ∈̲ α ∈ β^{γ⁺}. This double inequality uniquely determines γ; it must be
the largest ordinal for which β^γ ∈̲ α. Once γ is fixed, the division theorem
tells us that δ and ρ are unique. ⊣
Observe that in the logarithm theorem, we always have ρ ∈ β^γ ∈̲ α. An
interesting special case is where the base β is ω. Then given any nonzero
ordinal α, we can write

    α = ω^{γ₁}·n₁ + ρ₁,

where n₁ is a nonzero natural number and ρ₁ ∈ ω^{γ₁} ∈̲ α. If ρ₁ is nonzero,
then we can repeat:

    ρ₁ = ω^{γ₂}·n₂ + ρ₂,

where n₂ is a nonzero natural number and ρ₂ ∈ ω^{γ₂} ∈̲ ρ₁ ∈ ω^{γ₁}. Hence both
ρ₂ ∈ ρ₁ and γ₂ ∈ γ₁. Continuing in this way, we construct progressively
smaller ordinals ρ₁, ρ₂, .... This descending chain cannot be infinite, so
ρ_k = 0 for some (finite!) k. We then have α represented as an
"ω-polynomial"

    α = ω^{γ₁}·n₁ + ··· + ω^{γ_k}·n_k,

where n₁, ..., n_k are nonzero natural numbers and γ_k ∈ γ_{k−1} ∈ ··· ∈ γ₁. This
polynomial representation is called the Cantor normal form of α. You are
asked in Exercise 26 to show that it is unique.
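
The absorption phenomena behind the failure of right cancellation, and the "difference" supplied by the subtraction theorem, can be tried out mechanically on small ordinals. The following is only a sketch under a stated assumption: it handles ordinals below ω^ω, stored in Cantor normal form as a list of (exponent, coefficient) pairs with natural-number exponents in strictly decreasing order. The representation and the names `add`, `subtract`, `nat`, and `OMEGA` are illustrative choices, not notation from the text.

```python
from typing import List, Tuple

# Assumption: an ordinal below omega^omega is a list [(e1, n1), (e2, n2), ...]
# standing for omega^e1 * n1 + omega^e2 * n2 + ..., with e1 > e2 > ... natural
# numbers and every coefficient n_i nonzero. The empty list is 0.
Ordinal = List[Tuple[int, int]]

OMEGA: Ordinal = [(1, 1)]


def nat(n: int) -> Ordinal:
    """The natural number n as an ordinal in this representation."""
    return [(0, n)] if n else []


def add(a: Ordinal, b: Ordinal) -> Ordinal:
    """Ordinal sum a + b (not commutative)."""
    if not b:
        return list(a)
    lead_exp, lead_coef = b[0]
    # Terms of a below the leading exponent of b are absorbed by b.
    kept = [(e, c) for (e, c) in a if e > lead_exp]
    same = [c for (e, c) in a if e == lead_exp]
    merged = (lead_exp, lead_coef + (same[0] if same else 0))
    return kept + [merged] + list(b[1:])


def subtract(a: Ordinal, b: Ordinal) -> Ordinal:
    """For a <= b, the unique d with a + d = b (the "difference")."""
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    if i == len(a):               # a is an initial part of b
        return list(b[i:])
    (ea, ca), (eb, cb) = a[i], b[i]
    if eb > ea:                   # b's term dominates the rest of a
        return list(b[i:])
    # Same exponent, so cb > ca because a <= b is assumed.
    return [(ea, cb - ca)] + list(b[i + 1:])
```

With these definitions, `add(nat(2), OMEGA)` and `add(nat(3), OMEGA)` both return `OMEGA`, while `add(OMEGA, nat(2))` returns `[(1, 1), (0, 2)]`, that is, ω + 2. Python's built-in list comparison happens to agree with the ordinal ordering on this representation, since normal forms are compared term by term from the leading term.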

Theorem 8R  For any ordinal numbers,

    α^{β+γ} = α^β·α^γ.

Proof  We use transfinite induction on γ. In the limit ordinal case we
will use the normality of α-exponentiation. But normality is true only for α
greater than 1. So a separate proof is needed for the cases α = 0 and
α = 1. We leave this separate proof, which does not require induction, to
you (Exercise 27). Henceforth we assume that α is at least 2.

Suppose, as the inductive hypothesis, that α^{β+δ} = α^β·α^δ for all δ less
than γ; we must prove that α^{β+γ} = α^β·α^γ.

Case I  γ = 0. Then on the left side we have α^{β+0} = α^β. On the right
side we have α^β·α^0 = α^β·1 = α^β, so this case is done.

Case II  γ = δ⁺ for some δ. Then

    α^{β+γ} = α^{β+δ⁺}
            = α^{(β+δ)⁺}      by (A2)
            = α^{β+δ}·α        by (E2)
            = α^β·α^δ·α        by the inductive hypothesis
            = α^β·α^{δ⁺}       by (E2)
            = α^β·α^γ.

(The associative law was also used here.)

Case III  γ is a limit ordinal. Then

    α^{β+γ} = α^{sup{β+δ | δ ∈ γ}}      by (A3)
            = sup{α^{β+δ} | δ ∈ γ}      by Theorem 8E
            = sup{α^β·α^δ | δ ∈ γ}

by the inductive hypothesis. On the other side,

    α^β·α^γ = α^β·sup{α^δ | δ ∈ γ}
            = sup{α^β·α^δ | δ ∈ γ}

by Theorem 8E again, this time applied to α^β-multiplication. Thus the two
sides agree. ⊣

Theorem 8S  For any ordinal numbers,

    (α^β)^γ = α^{β·γ}.

Proof  We again use transfinite induction on γ. And again we leave to
you the case α = 0 or α = 1 (Exercise 27). Henceforth we assume that α is
at least 2.

Suppose, as the inductive hypothesis, that (α^β)^δ = α^{β·δ} for all δ less than
γ; we must prove that (α^β)^γ = α^{β·γ}.

Case I  γ = 0. Obviously both sides are equal to 1.

Case II  γ = δ⁺. Then we calculate

    (α^β)^γ = (α^β)^δ·α^β
            = α^{β·δ}·α^β      by the inductive hypothesis
            = α^{β·δ+β}        by the previous theorem
            = α^{β·δ⁺}
            = α^{β·γ}

as desired.

Case III  γ is a limit ordinal. Then

    α^{β·γ} = α^{sup{β·δ | δ ∈ γ}}
            = sup{α^{β·δ} | δ ∈ γ}      by Theorem 8E
            = sup{(α^β)^δ | δ ∈ γ}      by the inductive hypothesis
            = (α^β)^γ

as desired. ⊣
Example  As with finite numbers, α^β^γ always means α^{(β^γ)}. The other
grouping, (α^β)^γ, equals α^{β·γ} by Theorem 8S. Now suppose that we start with
ω and apply exponentiation over and over again. Let

    ε₀ = sup{ω, ω^ω, ω^{ω^ω}, ...}.

Then by Theorem 8E

    ω^{ε₀} = sup{ω^ω, ω^{ω^ω}, ...} = ε₀.

The equation ε₀ = ω^{ε₀} gives the Cantor normal form representation for ε₀.
More generally, the epsilon numbers are the ordinals ε for which ε = ω^ε. The
smallest epsilon number is ε₀. It is a countable ordinal, being the countable
union of countable sets. By the Veblen fixed-point theorem, the class of
epsilon numbers is unbounded.


Exercises 


20. Show that every ordinal number is expressible in the form λ + n where
n ∈ ω and λ is either zero or a limit ordinal. Show further that the
representation in this form is unique.

21. In Exercise 4 of Chapter 7, find the ordinal number of <P, R>.

22. Prove parts (a) and (b) of Theorem 8Q by transfinite induction on α.

23. (a) Show that ω + ω² = ω².
(b) Show that whenever ω² ∈̲ β, then ω + β = β.

24. Assume that ω ∈̲ α and α² ∈̲ β. Show that α + β = β. (This generalizes
the preceding exercise.)

25. Consider any fixed ordinals α and θ. Show that

    α + θ = α ∪ {α + δ | δ ∈ θ}.

26. Prove that the representation of an ordinal number in Cantor normal
form is unique.

27. Supply proofs for Theorem 8R and Theorem 8S when α is 0 or 1.

28. Show that for any given ordinal number, there is a larger epsilon
number.

29. Show that the class of epsilon numbers is closed, i.e., if S is a
nonempty set of epsilon numbers, then sup S is an epsilon number.


CHAPTER 9 


SPECIAL TOPICS 


In this final chapter, we present three topics that stand somewhat apart 
from our previous topics, yet are too interesting to omit from the book. The 
three sections are essentially independent. 


WELL-FOUNDED RELATIONS 


Some of the important properties of well orderings (such as transfinite 
induction and recursion) depend more on the “well” than on the “ordering.” 
In this section, we extend those properties to a larger class of relations. 

Recall that for a partial ordering R and a set D, an element m of D
was said to be a minimal element if there was no x in D with xRm. This
terminology can actually be applied to any set R:

Definition  An element m of a set D is said to be an R-minimal element
of D iff there is no x in D for which xRm.

Definition  A relation R is said to be well founded iff every nonempty
set D contains an R-minimal element.

If the set D in this definition is not a subset of fld R, then it is certain
to contain an R-minimal element (namely any m in D − fld R). Thus the
only sets D that matter here are the subsets of fld R.


Examples  The finite relation

    R = {<1, 2>, <2, 6>, <3, 6>}


is well founded. Its field is {1, 2, 3, 6}, and all fifteen nonempty subsets 
of the field have R-minimal elements. The empty relation is also well 
founded, but for uninteresting reasons. 
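
For a finite relation the definition can be checked by brute force, exactly as the example suggests: run through every nonempty subset of the field and look for an R-minimal element. The sketch below assumes that a relation is stored as a Python set of pairs `(x, y)` meaning xRy; the function names are illustrative only.

```python
from itertools import combinations


def is_minimal(m, D, R):
    """m is R-minimal in D iff there is no x in D with x R m."""
    return not any((x, m) in R for x in D)


def is_well_founded(R):
    """Check every nonempty subset of fld R (only sensible for small finite R)."""
    field = {x for pair in R for x in pair}
    for size in range(1, len(field) + 1):
        for D in combinations(field, size):
            if not any(is_minimal(m, D, R) for m in D):
                return False    # D has no R-minimal element
    return True


R = {(1, 2), (2, 6), (3, 6)}
```

Here `is_well_founded(R)` inspects the fifteen nonempty subsets of {1, 2, 3, 6} and returns `True`. A relation containing a cycle, such as `{(1, 2), (2, 1)}`, fails the test because the subset {1, 2} has no minimal element.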


Example  Consider any set S and the membership relation

    ∈_S = {<x, y> ∈ S × S | x ∈ y}

on S. Then we can show from the regularity axiom that ∈_S is well founded.
For consider any nonempty set D. By regularity there is some m in D
with m ∩ D = ∅, i.e., there is no x in D with x ∈ m. So certainly there is
no x in D with x ∈_S m. Hence m is ∈_S-minimal in D. In fact the regularity
axiom can be summarized by the statement: The membership relation is
well founded.


The following theorem extends Theorem 7B, which asserted that a linear 
ordering was a well ordering iff it had no descending chains. 


Theorem 9A  A relation R is well founded iff there is no function f
with domain ω such that f(n⁺)Rf(n) for each n.

Proof  The proof is exactly the same as for Theorem 7B. If R is not
well founded, then there exists a nonempty set A without an R-minimal
element, i.e., (∀x ∈ A)(∃y ∈ A) yRx. So we can apply Exercise 20 of Chapter 6
to obtain a descending chain. ⊣

If R is a binary relation on A (i.e., R ⊆ A × A) that is well founded,
then we can say that R is a well-founded relation on A or that <A, R> is a
well-founded structure.


Transfinite Induction Principle  Assume that R is a well-founded relation
on A. Assume that B is a subset of A with the special property that for
every t in A,

    {x ∈ A | xRt} ⊆ B  ⟹  t ∈ B.

Then B coincides with A.

This theorem is the direct analogue of the transfinite induction principle
for well-ordered relations (in Chapter 7). It asserts that any R-inductive
subset of A must actually be A itself. In fact the earlier theorem is a special
case of the above theorem, wherein R linearly orders A.

Proof  The proof is the same as before. If B is a proper subset of A,
then A − B has an R-minimal element m. By the minimality,
{x ∈ A | xRm} ⊆ B. But then by the special property of B (the property of
being R-inductive), m ∈ B after all. ⊣


Next we want to describe how any relation R (well founded or not)
can be extended to R^t, the smallest transitive relation extending R.

Theorem 9B  Let R be a relation. Then there exists a unique relation
R^t such that:

(a) R^t is a transitive relation and R ⊆ R^t.
(b) If Q is any transitive relation that includes R, then R^t ⊆ Q.

Proof  There are two equivalent ways to obtain R^t. The brute force
method is to define R* "from above" by

    R* = ⋂{Q | R ⊆ Q ⊆ fld R × fld R & Q is a transitive relation}.

The collection of all such Q's is nonempty, since fld R × fld R is a member.
Hence it is permissible to take the intersection.

Then R ⊆ R* (because R ⊆ Q for each of those Q's). R* is a transitive
relation (recall Exercise 34 of Chapter 3). And R* is as small as possible,
being the intersection of all candidates. Hence we may take R^t = R*.
Uniqueness is immediate; by (b) any two contenders must be subsets of each
other.

Although the theorem is now proved, we will ignore that fact and give
the construction of R^t "from below." Define R_n by recursion for n in ω by
the equations

    R₀ = R   and   R_{n+1} = R_n ∘ R

and then define R_* = ⋃_{n∈ω} R_n. For example, we can describe R_n for small
n as follows:

    R₀ = R = {<x, y> | xRy},
    R₁ = R ∘ R = {<x, y> | xRtRy for some t},
    R₂ = R ∘ R ∘ R = {<x, y> | xRt₁Rt₂Ry for some t₁ and t₂}.

In general,

    R_n = {<x, y> | xRt₁R ··· Rt_nRy for some t₁, ..., t_n}

and

    R_* = {<x, y> | xR_n y for some n}.

Clearly R ⊆ R_*. To show that R_* is a transitive relation, suppose that
xR_* yR_* z. Then xR_m yR_n z for some m and n. We claim that xR_{m+n+1} z.
This is clear from the above characterization of R_n; the "three dots"
technique can, as usual, be replaced by an induction on n. Hence xR_* z.

To see that R_* also satisfies the minimality clause (b), consider any
transitive relation Q that includes R. To show that R_* ⊆ Q, it suffices to
show that R_n ⊆ Q for each n. We do this by induction on n. It holds for
n = 0 by assumption. If R_n ⊆ Q and xR_{n+1}z, then we have xR_n yRz for
some y, whence xQyQz and (by transitivity) xQz. So R_{n+1} ⊆ Q and the
induction is complete.

We are now entitled to conclude that R^t = R* = R_*. ⊣
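
For a finite relation the construction "from below" is an algorithm: compose with R repeatedly and collect the layers until nothing new appears. The sketch below assumes relations are Python sets of pairs `(x, y)` meaning xRy, and it adopts the convention that P ∘ Q applies Q first; the function names are illustrative and not taken from the text.

```python
def compose(P, Q):
    """P o Q = {(x, z) | x Q y and y P z for some y}."""
    return {(x, z) for (x, y) in Q for (y2, z) in P if y == y2}


def transitive_extension(R):
    """Union of the layers R_0 = R, R_{n+1} = R_n o R, for a finite relation R."""
    R = set(R)
    result = set(R)   # union of the layers computed so far
    layer = set(R)    # the current layer R_n
    while True:
        layer = compose(layer, R)
        if layer <= result:
            # Once a layer adds no new pairs, no later layer can add any.
            return result
        result |= layer


def is_transitive(Q):
    """Direct check of transitivity, used only to inspect the output."""
    return all((x, w) in Q for (x, y) in Q for (z, w) in Q if y == z)
```

Applied to the earlier example `{(1, 2), (2, 6), (3, 6)}`, the function adds the single pair `(1, 6)`. Applied to the cyclic relation `{(1, 2), (2, 1)}`, it also produces `(1, 1)` and `(2, 2)`, which shows concretely why the transitive extension of a relation that is not well founded can fail to be irreflexive.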


Example  Assume that S is a transitive set. Define the binary relation

    ∈_S = {<x, y> ∈ S × S | x ∈ y}

on S. Then for x and y in S,

    x ∈_S^t y  ⟺  x ∈_S ··· ∈_S y
               ⟺  x ∈ ··· ∈ y,

where the intermediate points are automatically in S (by transitivity). Hence
we have

    x ∈_S^t y  ⟺  x ∈ TC y.

Recall from Exercise 7 of Chapter 7 that TC y, the transitive closure of y,
is the smallest transitive set that includes y. Its members are roughly the
members of members of ··· of members of y. A little more precisely,

    TC y = y ∪ ⋃y ∪ ⋃⋃y ∪ ···.


Theorem 9C  If R is a well-founded relation, then its transitive extension
R^t is also well founded.

Proof  We will give a proof that uses the axiom of choice. Exercise 1
requests an alternative proof that does not require choice.

Suppose that, contrary to our expectations, R^t is not well founded.
Then by Theorem 9A there is a descending chain f. That is, f is a
function with domain ω and f(n⁺)R^t f(n) for each n.

The idea is to fill in this descending chain to get a descending chain
for R, and thereby to contradict the assumption that R is well founded.
Since f(n⁺)R^t f(n), we know that

    f(n⁺)Rx₁R ··· Rx_kRf(n)

for some x₁, ..., x_k. By interpolating these intermediate points between
f(n⁺) and f(n) (and doing this for each n), we get a descending chain g
such that g(m⁺)Rg(m) for each m. ⊣

Corollary 9D  If R is a well-founded relation on A, then R^t is a
partial ordering on A.

Proof  Certainly R^t ⊆ A × A and R^t is a transitive relation. So we need
show only that R^t is irreflexive. But any well-founded relation is irreflexive;
we cannot have xR^t x lest {x} fail to have an R^t-minimal element. ⊣

Because of this corollary, transitive well-founded relations are sometimes
called partial well orderings.


Transfinite Recursion Theorem Schema  For any formula γ(x, y, z) the
following is a theorem:

Assume that R is a well-founded relation on a set A. Assume that for
any f and t there is a unique z such that γ(f, t, z). Then there exists a
unique function F with domain A such that

    γ(F ↾ {x ∈ A | xRt}, t, F(t))

for all t in A.

Here γ defines a function-class G, and the equation

    F(t) = G(F ↾ {x ∈ A | xRt}, t)

holds for all t in A. A comparison of the above theorem schema with
its predecessor in Chapter 7 will show that we have added an additional
variable to γ (and to G). This is because knowing what F ↾ {x ∈ A | xRt}
is does not tell us what t is. The proof of the above theorem schema is
much like the proofs of recursion theorems given in Chapters 4 and 7.


Proof  As before, we form F by taking the union of many approximating
functions. For any x in A, define seg x to be {t | tRx}. For the purposes of
this proof, call a function v acceptable iff dom v ⊆ A and whenever x ∈ dom v,
then seg x ⊆ dom v and γ(v ↾ seg x, x, v(x)).

1. First we claim that any two acceptable functions v₁ and v₂ agree
at all points (if any) belonging to both domains. Should this fail, there is
an R-minimal element x of dom v₁ ∩ dom v₂ for which v₁(x) ≠ v₂(x). By
the minimality, v₁ ↾ seg x = v₂ ↾ seg x. But then acceptability together with
our assumption on γ tells us that v₁(x) = v₂(x) after all.

We can now use a replacement axiom to form the set 𝒦 of all
acceptable functions. Take φ(u, v) to be the formula: u ⊆ A & v is an
acceptable function with domain u. It follows from the preceding paragraph
that

    φ(u, v₁) & φ(u, v₂)  ⟹  v₁ = v₂.

Hence by replacement there is a set 𝒦 such that

    v ∈ 𝒦  ⟺  (∃u ∈ 𝒫A) φ(u, v).

Letting 𝒦 be the set of all acceptable functions, we can explicitly
define F to be the set ⋃𝒦. Thus

    (☆)   <x, y> ∈ F  ⟺  v(x) = y for some acceptable v.

Observe that F is a function, because any two acceptable functions agree
wherever both are defined.

2. Next we claim that the function F is acceptable. Consider any x in
dom F. Then there exists some acceptable v with x ∈ dom v. Hence x ∈ A
and seg x ⊆ dom v ⊆ dom F. We must check that γ(F ↾ seg x, x, F(x)) holds.
We have

    γ(v ↾ seg x, x, v(x))         by acceptability,
    v ↾ seg x = F ↾ seg x         by (☆) and (1),
    v(x) = F(x)                   by (☆),

from which we can conclude that γ(F ↾ seg x, x, F(x)).

3. We now claim that dom F is all of A. If this fails, then there is an
R-minimal element t of A − dom F. By minimality, seg t ⊆ dom F. Take the
unique y such that γ(F ↾ seg t, t, y) and let v̂ = F ∪ {<t, y>}. We must
verify that v̂ is acceptable. Clearly v̂ is a function and dom v̂ ⊆ A.
Consider any x in dom v̂. One possibility is that x ∈ dom F. Then
seg x ⊆ dom F ⊆ dom v̂ and from the equations

    v̂ ↾ seg x = F ↾ seg x   and   v̂(x) = F(x)

we conclude that γ(v̂ ↾ seg x, x, v̂(x)). The other possibility is that x = t.
Then seg t ⊆ dom F ⊆ dom v̂ and from the equations

    v̂ ↾ seg t = F ↾ seg t   and   v̂(t) = y

we conclude that γ(v̂ ↾ seg t, t, v̂(t)). Hence v̂ is acceptable, and so t ∈ dom F
after all. Consequently, dom F = A.

4. F is unique, by an inductive argument like those used before. ⊣
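
On a finite well-founded structure the recursion can be carried out directly, and the code mirrors step 3 of the proof: F is extended to a new point t as soon as it is defined on all of seg t. The sketch below is an illustration under assumptions, not the construction used in the proof. A is taken to be a finite Python set, R a set of pairs, and the function-class G an ordinary Python function `G(f, t)` receiving the restriction of F to seg t (as a dict) together with t itself. The names `wf_recursion` and `height` are hypothetical.

```python
def wf_recursion(A, R, G):
    """Return F as a dict with F[t] = G(F restricted to seg t, t) for all t in A.

    Assumes A is finite and R is a well-founded relation on A.
    """
    seg = {t: {x for x in A if (x, t) in R} for t in A}
    F = {}
    remaining = set(A)
    while remaining:
        # Any R-minimal element of `remaining` has its whole segment in dom F.
        ready = [t for t in remaining if seg[t].issubset(F)]
        if not ready:
            raise ValueError("R is not well founded on A")
        for t in ready:
            F[t] = G({x: F[x] for x in seg[t]}, t)
            remaining.discard(t)
    return F


def height(f, t):
    """Example G: one more than the largest value on seg t, or 0 if seg t is empty."""
    return 1 + max(f.values(), default=-1)
```

For A = {1, 2, 3, 6} and the earlier relation `{(1, 2), (2, 6), (3, 6)}`, calling `wf_recursion(A, R, height)` assigns 0 to the minimal elements 1 and 3, then 1 to 2, then 2 to 6. The second argument of `G` goes unused by `height`, but it is passed along because, as noted above, the restriction of F to seg t does not determine t.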


We now proceed to show how transfinite recursion can be applied to
the membership relation to produce the rank of a set (and thereby generate
all of the ordinal numbers). Recall that in Chapter 7 we applied transfinite
recursion to a well-ordered structure <A, <> to obtain a function E with
domain A such that

    E(a) = {E(x) | x < a}

for each a ∈ A. It then turned out that ran E was (by definition) the
ordinal number of <A, <>.

Now we intend to perform the analogous construction using, in place
of a well ordering, the membership relation on some set. After all, the
regularity axiom assures us that the membership relation is always well
founded. So we can apply transfinite recursion. What will we get? The
answer is that if we do it right, we will get the rank of the set!

To be more specific, let S be the set <1, 0>, or

    {{{∅}}, {{∅}, ∅}}

to use its full name. We can illustrate the formation of this set by Fig. 48.
The sets appearing below S in this figure are the sets in TC S, the transitive
closure of S.

Fig. 48. The membership relation on TC{<1, 0>}. [Diagram not reproduced.]

Let M be the membership relation on TC S:

    M = {<x, y> | x ∈ y ∈ TC S}.

The pairs in this relation are illustrated by straight lines in Fig. 48.
The relation M is sure to be well founded; in the present example it is also
finite. We have seen that M^t is a well-founded partial ordering on TC S;
in fact it is the partial ordering corresponding to Fig. 48. Since M^t is
well founded, we can apply transfinite recursion to obtain the unique function
E on TC S for which

    E(a) = {E(x) | xM^t a}

for every a in TC S. (In the transfinite recursion theorem schema, take
γ(f, t, z) to be the formula z = ran f. Then

    E(a) = ran(E ↾ {x ∈ TC S | xM^t a})
         = E⟦{x | xM^t a}⟧
         = {E(x) | xM^t a}

as predicted.) The values of E are shown by the bracketed numerals in
Figure 48. Observe that ran E = E⟦TC S⟧ = 3, which is indeed rank S.
And furthermore E(a) = rank a for every a in TC S.
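
The computation for S = <1, 0> is small enough to replay by machine. The sketch below models hereditarily finite sets as nested Python `frozenset` values, which is an assumption of the illustration, and it uses the fact from the earlier example that, on a transitive set, x stands in the transitive extension of membership to a exactly when x ∈ TC a. The names `tc` and `E_function` are illustrative.

```python
def tc(s):
    """Transitive closure TC s: members, members of members, and so on."""
    result, layer = set(), set(s)
    while layer:
        result |= layer
        layer = {x for y in layer for x in y} - result
    return frozenset(result)


def E_function(S):
    """The function E on TC S with E(a) = {E(x) | x in TC a}, as a dict."""
    E = {}

    def value(a):
        if a not in E:
            E[a] = frozenset(value(x) for x in tc(a))
        return E[a]

    for a in tc(S):
        value(a)
    return E


# The first few von Neumann naturals and the ordered pair <1, 0> = {{1}, {1, 0}}.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})
S = frozenset({frozenset({one}), frozenset({one, zero})})

E = E_function(S)
ran_E = frozenset(E.values())
```

In this run TC S has four members, namely 0, 1, {1}, and {1, 0} (the last of which is the set 2), and `ran_E` comes out equal to `three`, in agreement with ran E = 3 = rank S.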

Was it just a lucky coincidence that for S = <1, 0> we reproduced the 
rank function? (Would we have gone through the example if it were?) The 
following theorem says that the outcome is inevitable. 


Theorem 9E  Let S be any set and let

    M = {<x, y> | x ∈ y ∈ TC S}.

Let E be the unique function with domain TC S such that

    E(a) = {E(x) | xM^t a}

for each a in TC S. Then ran E = rank S. Furthermore E(a) = rank a for
each a in TC S.

Proof  This theorem is a consequence of Theorem 7V, which characterized
rank a as the least ordinal strictly greater than rank x for every x in a.
The heart of the proof consists of showing that

    (☆)   rank a = {rank x | xM^t a}

for all a in TC S.

To prove the "⊇" half of (☆), we use Theorem 7V(a):

    xM^t a  ⟹  x ∈ ··· ∈ a
            ⟹  rank x ∈ ··· ∈ rank a
            ⟹  rank x ∈ rank a.

The "⊆" half of (☆) is proved by transfinite induction on a over the
well-founded structure <TC S, M^t>. Suppose then that (☆) holds of b
whenever bM^t a and that α ∈ rank a. We must show that α = rank x for some
x with xM^t a. By Theorem 7V(b), α ∈ (rank b)⁺ for some b ∈ a, so α ∈̲ rank b.
If α = rank b, we are done, so suppose not. By the inductive hypothesis

    α ∈ rank b ⊆ {rank x | xM^t b}

so that we have

    α = rank x  &  xM^t b  &  b ∈ a

for some x. Since xM^t a, the proof of (☆) is complete.

From (☆) and the equation E(a) = {E(x) | xM^t a} we can conclude (from
the uniqueness clause in recursion) that E(a) = rank a for all a in TC S.
By applying (☆) to a set containing S (in place of S itself) we obtain

    rank S = {rank a | a ∈ ··· ∈ S} = ran E

as claimed. ⊣
Theorem 9E gives one of the (many) possible ways of introducing
ordinal numbers. It produces the rank operation directly (without reference
to ordinals). Thus one could then define an ordinal number to be something
that is the rank of some set. It then follows, for example, that rank α = α
for any ordinal α. And all this can be carried through without any mention
of well-ordered structures! One can then proceed to develop the usual
properties of ordinal numbers (e.g., those in Theorem 7M) by using the
rank operation. The end result of such an approach would be equivalent
to the more traditional approach followed in Chapter 7.


Exercises 


1. Prove Theorem 9C without using the axiom of choice.

2. Give an intrinsic characterization of those relations R with the property
that R^t is a partial ordering.

3. (König) Assume that R is a well-founded relation such that {x | xRy}
is finite for each y. Prove that {x | xR^t y} is finite for each y.

4. Show that for any set S,

    TC S = S ∪ ⋃{TC x | x ∈ S}.


NATURAL MODELS 


In this section we want to consider the question of whether there can
be a set M that is in a certain sense a miniature model of the class V
of all sets. Consider a formula σ of the language of set theory; for
example σ might be one of our axioms. (Recall from Chapter 2 the ways
of constructing formulas.) We can convert σ to a new formula σ^M by
replacing expressions ∀x and ∃x by ∀x ∈ M and ∃x ∈ M. Then the
proposition (true or false) that σ asserts of V is asserted by σ^M of just M.

Example  Let σ be the pairing axiom,

    ∀u ∀v ∃B ∀x[x ∈ B ⟺ x = u or x = v].

Then σ^M is

    (∀u ∈ M)(∀v ∈ M)(∃B ∈ M)(∀x ∈ M)[x ∈ B ⟺ x = u or x = v].

In something closer to English, this says that you can take any u and
v inside the set M, and be assured of the existence of a set B inside the set
M whose members belonging to the set M are exactly u and v. If somebody
has the delusion that M is the class of all sets, then σ^M asserts what he
believes the pairing axiom to assert.
The formula σ^M is called the relativization of σ to M. We say that
σ is true in M (or that M is a model of σ) iff σ^M is true. Now it is
conceivable that there might be some set M that was a model for all of
our axioms. It is also conceivable that no such M exists. In this section
we will be particularly concerned with the question of whether V_α, for
some lucky α, might be a model of all of our axioms.

Logical Notes  (1) What we here call a model might also be called
an "∈-model." There is a more general concept of a model <M, E>
involving not only reinterpretation of ∀ and ∃ (to refer to M), but also a
reinterpretation of ∈ (by means of E). We will not delve into this more
general concept; we only caution you that it exists. (2) There is a
fascinating theorem of mathematical logic, called "Gödel's second
incompleteness theorem." It implies, among other things, that if our
axioms are consistent (as we sincerely believe them to be), then we can
never use them to prove the existence of a model of the axioms. Such a
model may exist (as a set), but a proof (from our axioms) that it is
there does not exist. So do not expect such a proof in this section!

We will now go through the axioms one by one, checking for each
whether its relativization is true in V_α for suitable α.


1. Extensionality. We claim that the relativization of extensionality
to any transitive set (and V_α is always a transitive set) is true. That
is, consider any transitive set M and any sets A and B in M. Our
claim is that if

    (☆)   (∀x ∈ M)(x ∈ A ⟺ x ∈ B),

then A = B. Well, if x ∈ A, then x ∈ A ∈ M, and by the transitivity of
M we have x ∈ M. Hence we can apply (☆) to conclude that x ∈ B.
Thus A ⊆ B, and similarly B ⊆ A. So A = B.

2. Empty set. We claim that the relativization of the empty set
axiom to V_α is true provided that α ≠ 0. We must show that (for α ≠ 0),

    (∃B ∈ V_α)(∀x ∈ V_α) x ∉ B.

What should we take for B? The empty set, of course. Since 1 ∈̲ α,
we have ∅ ∈ V₁ ⊆ V_α, so the empty set is in V_α. And certainly nothing
in V_α (or elsewhere) belongs to the empty set.


3. Pairing. We claim that the relativization of the pairing axiom to
V_λ is true for any limit ordinal λ. To prove this, consider any u and v
in V_λ. Since V_λ = ⋃_{α∈λ} V_α, we have u ∈ V_α and v ∈ V_β for some α and β
less than λ. Either α ∈̲ β or β ∈̲ α; by the symmetry we may suppose
that α ∈̲ β. Then {u, v} ⊆ V_β and

    {u, v} ∈ V_{β⁺} ⊆ V_λ.

Thus in V_λ there is a set B (namely {u, v}) such that

    (∀x ∈ V_λ)(x ∈ B ⟺ x = u or x = v).

And this is what we want.


4. Union. The union axiom is true in V_α for any α. For suppose
that A ∈ V_α. Then A ⊆ V_β for some β less than α. Since V_β is a transitive
set,

    x ∈ ⋃A  ⟹  x ∈ b ∈ A   for some b
            ⟹  x ∈ V_β.

So ⋃A ⊆ V_β, whence ⋃A ∈ V_α. Thus we have a set B in V_α (namely, ⋃A)
such that for any x in V_α (or elsewhere),

    x ∈ B  ⟺  (∃b)(x ∈ b ∈ A)
           ⟺  (∃b ∈ V_α)(x ∈ b ∈ A)

since V_α is transitive.

5. Power set. As with the pairing axiom, the power set axiom is
true in V_λ whenever λ is a limit ordinal. To prove this, consider any set
a in V_λ. Then rank a < λ, so that

    rank 𝒫a = (rank a)⁺ < λ

and 𝒫a ∈ V_λ. Thus we have a set B in V_λ (namely 𝒫a) such that for any
x in V_λ,

    x ∈ B  ⟺  x ⊆ a
           ⟺  ∀t(t ∈ x ⟹ t ∈ a)
           ⟺  (∀t ∈ V_λ)(t ∈ x ⟹ t ∈ a)

since V_λ is transitive.

6. Subset. All subset axioms are true in V_α for any α. For consider
any set c in V_α and any formula φ not containing B. We seek a set B
in V_α such that for any x in V_α,

    x ∈ B  ⟺  x ∈ c & φ^{V_α}.

This tells us exactly what to take for B, namely {x ∈ c | φ^{V_α}}. Since
B ⊆ c ∈ V_α, we have B ∈ V_α.




7. Infinity. The infinity axiom is true in V_α for any α greater than
ω. We have ω ⊆ V_ω, whence ω ∈ V_α for any larger α. As with the other
axioms, this implies that the infinity axiom is true in V_α.

8. Choice. The axiom of choice (version I) is true in V_α for any α.
If R is a relation in V_α, then any subset of R is in V_α. In particular,
the subfunction F of R provided by the axiom of choice (version I) is
in V_α. (For some other versions of the axiom of choice, we need to
assume that α is a limit ordinal. Although Theorem 6M proves that the
several versions are equivalent, the proof of Theorem 6M uses some of the
above-listed axioms. And those axioms may fail in V_α if α is not a limit
ordinal.)

9. Regularity. The regularity axiom is true in V_α for any α. Consider any
nonempty set A in V_α. Let m be a member of A having least possible
rank. Then m ∈ V_α and m ∩ A = ∅. (Note that we do not need to use
the regularity axiom in this proof, in contrast to the situation with all
other axioms.)


We have now verified the following result. 


Theorem 9F  If λ is any limit ordinal greater than ω, then V_λ is a
model of all of our axioms except the replacement axioms.

Our axioms (including replacement) are called the Zermelo-Fraenkel
axioms, whereas the Zermelo axioms are obtained by omitting replacement.
(Recall the historical notes of Chapter 1.) The above theorem asserts that
V_λ is a model of the Zermelo axioms whenever λ is a limit ordinal greater
than ω.


The smallest limit ordinal greater than ω is ω·2. Informally, we can 
describe the ordinal ω·2 by the equation 


ω·2 = {0, 1, 2, ..., ω, ω⁺, ω⁺⁺, ...}, 


listing the smaller ordinals. It is the ordinal of the set ℤ of integers under 
the peculiar ordering 


0 < 1 < 2 < ⋯ < −1 < −2 < −3 < ⋯. 
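
As a small illustration of this ordering, the following Python sketch sorts a 
finite window of integers so that the nonnegative ones come first in their 
usual order, followed by the negative ones in order of increasing absolute 
value. The names omega_times_two_key and precedes are hypothetical, 
introduced only for this example, and a finite list can of course only hint 
at the order type ω·2. 

```python
def omega_times_two_key(n):
    """Sort key for the ordering 0 < 1 < 2 < ... < -1 < -2 < -3 < ...

    Nonnegative integers come first (False sorts before True), and
    within each block integers are ordered by absolute value.
    """
    return (n < 0, abs(n))


def precedes(m, n):
    """True iff m comes strictly before n in the peculiar ordering."""
    return omega_times_two_key(m) < omega_times_two_key(n)


window = sorted(range(-3, 4), key=omega_times_two_key)
print(window)  # [0, 1, 2, 3, -1, -2, -3]
```

Every nonnegative integer precedes every negative one, which is why the 
ordering consists of one copy of ω followed by a second copy. 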


V_ω contains the sets of finite rank. Any set in V_ω is finite, its members 
are all finite, and so forth. We can informally describe V_{ω·2} as 


V_{ω·2} = V_ω ∪ 𝒫V_ω ∪ 𝒫𝒫V_ω ∪ ⋯. 


In V_{ω·2} all the sets needed in elementary mathematics can be found. 
The real numbers are all there, as are all functions from reals to reals 
(compare Exercise 27 of Chapter 7). This reflects the fact that we needed 
only the Zermelo axioms to construct the reals. What sets are not in 
V_{ω·2}? Well, the only ordinals in V_{ω·2} are those less than ω·2, by Exercise 26 
of Chapter 7. Since ω·2 is obviously a countable ordinal, V_{ω·2} contains 
only ordinals that are countable, and not even all of those. 


Lemma 9G There is a well-ordered structure in V_{ω·2} whose ordinal 
number is not in V_{ω·2}. 


Proof Start with an uncountable set S in V_{ω·2}, such as ℝ or 𝒫ω. 
By the well-ordering theorem, there exists some well ordering < on S. 
Now < ⊆ S × S ⊆ 𝒫𝒫S by Lemma 3B, so < is also in V_{ω·2}. (The rank of 
< is only two steps above the rank of S.) Hence by going up two more 
steps, we have ⟨S, <⟩ in V_{ω·2}. But the ordinal number of ⟨S, <⟩, being 
uncountable, cannot be in V_{ω·2}. ⊣ 


Corollary 9H Not all of the replacement axioms are true in V_{ω·2}. 


Sketch of Proof Let σ be the formula of set theory: “For any 
well-ordered structure ⟨S, <⟩ there exists an ordinal α such that 
⟨S, <⟩ is isomorphic to ⟨α, ∈_α⟩.” Then σ can be proved from the 
Zermelo-Fraenkel axioms; we proved it back in Chapter 7 (see 
Theorem 7D). But we claim that σ is not true in V_{ω·2}. This claim 
follows from Lemma 9G, together with some argument to the effect 
that if it appears from inside V_{ω·2} that α is the ordinal number of 
⟨S, <⟩, then it really is. 

But if V_{ω·2} were a model for the Zermelo-Fraenkel axioms, it would 
have to be a model for any theorem that follows from those axioms. 
(This was the meaning of “theorem” back in Chapter 1.) So V_{ω·2} cannot 
be a model of the Zermelo-Fraenkel axioms. Since it is a model of the 
Zermelo axioms, it must be replacement axioms that fail in V_{ω·2}. ⊣ 


Corollary 9I Not all of the replacement axioms are theorems of the 
Zermelo axioms. 


Proof Any theorem of the Zermelo axioms must be true in any 
model of those axioms, such as V_{ω·2}. But by the preceding result, not all 
replacement axioms are true in V_{ω·2}. ⊣ 


The standard abbreviation for the Zermelo-Fraenkel axioms is “ZF” 
(or “ZFC” if we want to emphasize that the axiom of choice is included). 
The above corollary shows that there is a sense in which ZF is strictly 
stronger than the Zermelo axioms. It is our only excursion into the 
metamathematics of set theory. 


Definition A cardinal number κ is said to be inaccessible iff it meets the 
following three conditions. 


(a) κ is greater than ℵ_0. 

(b) For any cardinal λ less than κ, we have 2^λ < κ. (Here cardinal 
exponentiation is used.) 

(c) It is not possible to represent κ as the supremum of fewer than κ 
smaller ordinals. That is, if S is a set of ordinals less than κ and if card S < κ, 
then the ordinal sup S is less than κ. 


Cardinals meeting clause (c) are called regular; we will have more to 
say about such cardinals in the section on cofinality. 


Examples Conditions (b) and (c) are true when κ = ℵ_0 (by Corollary 
6K and Exercise 13 of Chapter 6). But of course condition (a) fails. 
Conditions (a) and (c) hold when κ = ℵ_1 (since a countable union of 
countable ordinals is countable). But condition (b) fails for ℵ_1. Conditions 
(a) and (b) hold when κ = ℶ_ω, but condition (c) fails since ℶ_ω = ⋃_{n∈ω} ℶ_n. 
What is an example of a cardinal meeting all three conditions? Are there 
any inaccessible cardinals at all? We will return to these questions after 
the next theorem. 


First we need the following lemma, which relates the beth numbers to 
the V_α sets. 


Lemma 9J For any ordinal number α, 

card V_{ω+α} = ℶ_α. 

Proof We use transfinite induction on α. 


Case I α = 0. We must show that card V_ω = ℵ_0. But this is clear, 
since V_ω is the union of ℵ_0 finite sets of increasing size. 


Case II α is a successor ordinal β⁺. Then 

V_{ω+α} = V_{(ω+β)⁺} = 𝒫V_{ω+β} 

and its cardinality is 

2^{card V_{ω+β}} = 2^{ℶ_β} = ℶ_{β⁺} 

by the inductive hypothesis. 


Case III α is a limit ordinal. Then ω+α is also a limit ordinal 
and so ω+α = sup{ω+δ | δ ∈ α}. Consequently (Exercise 9), V_{ω+α} = 
⋃{V_{ω+δ} | δ ∈ α}. Since card V_{ω+δ} = ℶ_δ, the cardinality of the union is at 
least ℶ_α. On the other hand, it is at most (card α)·ℶ_α. Since card α ≤ α ≤ ℶ_α 
(by the monotonicity of the beth operation), this product is just ℶ_α. ⊣ 


Lemma 9K Assume that κ is inaccessible. 


(a) If α is an ordinal less than κ, then ℶ_α is less than κ. 
(b) If A ∈ V_κ, then card A < κ. 


Proof We prove part (a) by transfinite induction on α. So suppose 
that the condition holds for all ordinals less than α, where α is less than κ. 


Case I α = 0. Then ℶ_0 = ℵ_0 < κ since κ is uncountable. 


Case II α = β⁺ for some β. Then ℶ_β < κ by the inductive hypothesis, 
and ℶ_α = 2^{ℶ_β} < κ by inaccessibility. 


Case III α is a limit ordinal. Then by the inductive hypothesis, 
ℶ_γ < κ for all γ ∈ α. Then ℶ_α = sup{ℶ_γ | γ ∈ α} < κ, since κ is not the 
supremum of fewer than κ (and card α < κ) smaller ordinals. 


This proves part (a). For part (b), suppose that A ∈ V_κ. Then A ⊆ V_α 
for some α less than κ. Hence we have 


card A ≤ card V_α ≤ ℶ_α < κ 


by the preceding lemma and part (a). ⊣ 


Theorem 9L If κ is an inaccessible cardinal number, then all of the 
ZF axioms (including the replacement axioms) are true in V_κ. 


Proof All of the Zermelo axioms are true in V_κ by Theorem 9F, 
since κ is an uncountable limit ordinal. The only axioms left to worry 
about are the replacement axioms. 

Consider a set A in V_κ and any formula φ(x, y) such that 


(∀x ∈ A) ∀y_1 ∀y_2 [φ(x, y_1) & φ(x, y_2) ⇒ y_1 = y_2] 

is true in V_κ. Define the function F by 

F = {⟨x, y⟩ ∈ A × V_κ | φ(x, y) is true in V_κ}. 


Notice that F is a function, by what we have said about φ. The domain 
of F is some subset of A. Let B = ran F; thus for any y in V_κ, 


y ∈ B ⟺ (∃x ∈ A) φ(x, y) is true in V_κ. 

What we must prove is that B ∈ V_κ, because then it will follow that 

∃B ∀y[y ∈ B ⟺ (∃x ∈ A) φ(x, y)] 


is true in V_κ. 
To show that B ∈ V_κ, consider the set 


S = {rank F(x) | x ∈ dom F}. 

The ordinals in S are all less than κ (because F(x) ∈ V_κ), and 
card S ≤ card dom F ≤ card A < κ by part (b) of the above lemma. So 
by the inaccessibility, the ordinal 


α = sup S = sup{rank F(x) | x ∈ dom F} 


is less than κ. But any member F(x) of B has rank no more than α, 
so F(x) ∈ V_{α⁺}. Hence B ⊆ V_{α⁺}, and so rank B ≤ α⁺ < κ. Thus B ∈ V_κ, as 
desired. ⊣ 


Now back to the question: Are there any inaccessible cardinals at 
all? If so, then by the above theorem, there are models of the ZF axioms. 
Because we cannot hope to prove the existence of such a model (due to 
Gödel’s second incompleteness theorem, mentioned earlier in this section), 
it follows that we cannot hope to prove from our axioms that inaccessible 
cardinals exist. 

On the other hand, we intuitively want the cardinal and ordinal 
numbers to go on forever. To deny the existence of inaccessible numbers 
would appear to impose an unnatural ceiling on “forever.” With these 
thoughts in mind, Alfred Tarski in 1938 proposed for consideration as 
an additional set-theoretic axiom the statement: 


For any cardinal number there is a larger inaccessible cardinal number. 


This is an example of a “large cardinal axiom.” Despite the fact that it 
literally concerns huge sets, one can prove from it new facts about natural 
numbers! The interest in various large cardinal axioms and their 
consequences motivates an important part of current research in set 
theory. 


Exercises 


5. Show that a set S belongs to V_ω iff TC S is finite. (Thus S belongs 
to V_ω iff S is finite and all the members of ... of members of S are 
finite. Because of this fact, V_ω is often called HF, the collection of 
hereditarily finite sets.) 


6. Are the replacement axioms true in V_ω? Which axioms are not true 
in V_ω? 

7. Let ℱ be the collection of finite subsets of ω. Let g be a one-to-one 
correspondence between ω and ℱ with the property that whenever 
m ∈ g(n), then m is less than n. (For example, we can use g(n) = {m ∈ ω | the 
coefficient of 2^m in the binary representation of n is 1}.) Define the binary 
relation E on ω by 

mEn ⟺ m ∈ g(n). 


Show that ⟨ω, E⟩ is isomorphic to ⟨V_ω, ∈⟩. 

8. The proof of Lemma 9G used the axiom of choice (in the form of 
the well-ordering theorem). Give a proof that does not use the axiom of 
choice. 


9. Assume that λ is a limit ordinal. Show that 


V_{α+λ} = ⋃{V_{α+δ} | δ ∈ λ}. 

10. Define the set 

S = ω ∪ 𝒫ω ∪ 𝒫𝒫ω ∪ ⋯. 


Prove that S is a model of the Zermelo axioms. 

11. Assume that κ is inaccessible. Show that ℶ_κ = κ and card V_κ = κ. 
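

The coding suggested in Exercise 7 can be tried out on small numbers. The 
following Python sketch is an illustration only and not a solution to the 
exercise: it implements the example g from the statement, the relation E, and 
two helper functions decode and encode (hypothetical names, not from the 
text) that pass between a natural number and the hereditarily finite set it 
codes. Sets are modeled as Python frozensets, which is an assumption of the 
sketch. 

```python
def g(n):
    """Return {m : the coefficient of 2**m in the binary representation of n is 1}."""
    return frozenset(m for m in range(n.bit_length()) if (n >> m) & 1)


def E(m, n):
    """The relation m E n, i.e. m is a member of g(n)."""
    return m in g(n)


def decode(n):
    """The hereditarily finite set coded by n: {decode(m) : m E n}."""
    return frozenset(decode(m) for m in g(n))


def encode(s):
    """Inverse of decode: the sum of 2**encode(t) over the members t of s."""
    return sum(2 ** encode(t) for t in s)


print(sorted(g(5)))        # [0, 2], since 5 = 2**0 + 2**2
print(decode(3))           # the set {∅, {∅}}, printed as nested frozensets
print(encode(decode(37)))  # 37
```

On an initial segment of ω one can check by brute force that m E n holds 
exactly when decode(m) is a member of decode(n), which is the isomorphism 
property the exercise asks you to prove in general. 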


COFINALITY 


Any limit ordinal is the supremum of the set of all smaller ordinals; 
this merely says that λ = ⋃λ for any limit ordinal λ. But we do not need 
to take all smaller ordinals. We can find a proper subset S of λ such that λ 
is the supremum ⋃S of S. How small can we take S to be? That will 
depend on what λ is. 


Definition The cofinality of a limit ordinal λ, denoted cf λ, is the 
smallest cardinal κ such that λ is the supremum of κ smaller ordinals. 
The cofinality of nonlimit ordinals is defined by setting cf 0 = 0 and cf α⁺ = 1. 


Thus to find the cofinality of λ, we seek a set S of smaller ordinals 
(i.e., S ⊆ λ) for which λ = sup S. Such a set S is said to be cofinal in λ. 
There will be many sets that are cofinal in λ; for example, λ itself is such 
a set. There will not exist any smallest (with respect to inclusion) set S 
cofinal in λ. But there will be a smallest possible cardinality for S, and 
this smallest cardinality is by definition cf λ. 


Example Obviously cf λ ≤ card λ, since λ is cofinal in itself. Sometimes 
equality holds; for example, cf ω = ℵ_0. (We could not have cf ω < ℵ_0, 
since a finite union of natural numbers would be finite.) On the other 
hand, sometimes cf λ ≠ card λ. For example, Theorem 9N will show that 
ℵ_ω (as a limit ordinal) has cofinality ℵ_0, being the supremum of 
{ℵ_0, ℵ_1, ℵ_2, ...}. 


Definition A cardinal κ is said to be regular iff cf κ = κ, and is said to 
be singular if cf κ < κ. 


The above example shows that ℵ_0 is regular whereas ℵ_ω is singular. 
Recall that an inaccessible cardinal is required to be regular. 


Note on Style The definition of cofinality has an awkward three-part 
format. Can we not give a simple definition covering zero, the successor 
ordinals, and the limit ordinals in one sweep? Yes, we can do so by using 
the idea of the “strict supremum.” The cofinality of any ordinal α is the 
least cardinal number κ such that there exists a subset S of α having 
cardinality κ and α is the least ordinal strictly greater than every member 
of S. (Exercise 12 asks you to verify that this is correct.) 


Theorem 9M ℵ_{α+1} is a regular cardinal number for every α. 


Proof Assume that ℵ_{α+1} = sup S, where S is a set of ordinals, each of 
which is less than ℵ_{α+1}. Then each member of S has cardinality at most 
ℵ_α. Therefore (compare Exercise 26 of Chapter 6) we have ℵ_{α+1} = 
card ⋃S ≤ (card S)·ℵ_α. But this implies that card S ≥ ℵ_{α+1}. ⊣ 


Since ℵ_0 is also a regular cardinal, there remain only the cardinals 
ℵ_λ where λ is a limit ordinal. 


Theorem 9N For any limit ordinal λ, cf ℵ_λ = cf λ. 


Proof First of all, we claim that cf ℵ_λ ≤ cf λ. We know that λ is the 
supremum of some set S ⊆ λ with card S = cf λ. It suffices to show that 
ℵ_λ = sup{ℵ_α | α ∈ S}. But this is Theorem 8E, applied to the aleph operation. 

Second, we claim that cf λ ≤ cf ℵ_λ. Suppose that ℵ_λ is the supremum 
of some set A of smaller ordinals. Let 


B = {γ ∈ λ | ℵ_γ is the cardinality of some ordinal in A}. 


Then card B ≤ card A. To complete the proof it suffices to show that 
sup B = λ. Any α in A has cardinality at most ℵ_{sup B}, so α ∈ ℵ_{sup B + 1}. 
Hence ℵ_λ = sup A ≤ ℵ_{sup B + 1} and so λ ≤ (sup B) + 1. Since λ is a limit 
ordinal, λ ≤ sup B, whence equality holds. ⊣ 


For example, cf ℵ_ω = cf ω = ℵ_0. And cf ℵ_Ω = cf Ω = ℵ_1 (where Ω is 
the first uncountable ordinal). In particular, both ℵ_ω and ℵ_Ω are singular. 
For any limit ordinal λ, 


cf ℵ_λ = cf λ ≤ λ ≤ ℵ_λ. 


If ℵ_λ is regular, then equality holds throughout. But does this ever 
happen? If it happens, λ is said to be weakly inaccessible. We cannot 
prove (from our axioms, if they are consistent) that any weakly inaccessible 
cardinals exist. If any inaccessible cardinals exist, then they are also weakly 
inaccessible (Exercise 15). 

There is another way of characterizing cofinality that uses increasing 
sequences of ordinals instead of sets of ordinals. For an ordinal number α, 
define an α-sequence to be simply a function with domain α. For example, 
an ordinary infinite sequence is an ω-sequence, and a finite sequence is an 
n-sequence for some natural number n. For an α-sequence f, it is customary 
to write f_ξ in place of the Eulerian f(ξ). An α-sequence f into the ordinals 
is called increasing iff it preserves order: 


ξ ∈ η ⇒ f_ξ ∈ f_η. 


An increasing α-sequence of ordinals is said to converge to β iff β = sup ran f, 
i.e., 


β = sup{f_ξ | ξ ∈ α}. 


From a nonincreasing sequence we can extract an increasing subsequence 
(possibly with a smaller domain): 


Lemma 9P Assume that f is an α-sequence (not necessarily increasing) 
into the ordinal numbers. There is an increasing β-sequence g into the 
ordinal numbers for some β ≤ α with sup ran f = sup ran g. 


Proof We will construct g so that it is a subsequence of f. That is, 
we will have g_ξ = f_{h_ξ} for a certain increasing sequence h. 

First define the sequence h by transfinite recursion over ⟨α, ∈_α⟩. 
Suppose that h_δ is known for every δ ∈ ξ. Then define 


h_ξ = the least γ such that f_{h_δ} ∈ f_γ for all δ ∈ ξ, 


if any such γ exists; if no such γ exists, then the sequence halts. Let 
β = dom h, and define 


g_ξ = f_{h_ξ} for ξ ∈ β. 

Then by construction, 


δ ∈ ξ ⇒ f_{h_δ} ∈ f_{h_ξ} ⇒ g_δ ∈ g_ξ, 

so g is increasing. Also h is increasing, because h is one-to-one and h_ξ 
is the least ordinal meeting certain conditions that become more stringent 
as ξ increases. 
We have ran g ⊆ ran f, so certainly sup ran g ≤ sup ran f. On the other 
hand we have 


f_δ ∈ f_{h_ξ} = g_ξ for all δ ∈ h_ξ, 


by the leastness of h_ξ. In particular, ξ ≤ h_ξ, so f_ξ ≤ g_ξ. Hence if β = α, then 
clearly sup ran f ≤ sup ran g. But if β is less than α, it is only because 
there exists no γ for which f_γ exceeds every g_δ for δ ∈ β. Again we conclude 
that sup ran f ≤ sup ran g. ⊣ 
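

For a finite sequence the recursion in this proof can be carried out 
literally. The following Python sketch does so under the assumption that f 
is a list of numbers indexed by 0, 1, ..., len(f) − 1; the function name 
increasing_subsequence is hypothetical. It covers only the finite case and 
is meant to show how h picks out successive new maxima, not to replace the 
transfinite argument. 

```python
def increasing_subsequence(f):
    """Mimic the recursion of Lemma 9P for a finite sequence f.

    Returns (h, g), where h lists the selected indices in order
    and g[i] = f[h[i]].
    """
    h = []
    while True:
        # h_xi = the least index gamma such that f[h_delta] < f[gamma]
        # for every earlier delta; the sequence halts if none exists.
        candidates = [
            gamma for gamma in range(len(f))
            if all(f[h_delta] < f[gamma] for h_delta in h)
        ]
        if not candidates:
            break
        h.append(candidates[0])
    g = [f[i] for i in h]
    return h, g


f = [3, 1, 4, 1, 5, 9, 2, 6]
h, g = increasing_subsequence(f)
print(h, g)  # [0, 2, 4, 5] [3, 4, 5, 9]
```

Here g is increasing, its domain is no longer than that of f, and the 
largest value of g equals the largest value of f, in line with the 
conclusion sup ran f = sup ran g. 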


This lemma is used in the proof of the following theorem. 


Theorem 9Q Assume that λ is a limit ordinal. Then there is an 
increasing (cf λ)-sequence into λ that converges to λ. 




Proof By definition of cofinality, λ is the supremum of cf λ smaller 
ordinals. That is, there is a function f from cf λ into λ such that 
λ = sup ran f. Apply the lemma to obtain an increasing β-sequence 
converging to λ for some β ≤ cf λ. But it is impossible that β ∈ cf λ, lest 
we have λ represented as the supremum of card β smaller ordinals, 
contradicting the leastness of cf λ. ⊣ 


Corollary 9R For a limit ordinal λ, we can characterize cf λ as the 
least ordinal α such that some increasing α-sequence into λ converges to λ. 


We know that cf λ ≤ λ. What about cf cf λ? Or cf cf cf λ? These are all 
equal to cf λ, by the next theorem. 


Theorem 9S For any ordinal λ, cf λ is a regular cardinal. 


Proof We must show that cf cf λ = cf λ. We may assume that λ is a 
limit ordinal, since 0 and 1 are regular. By the preceding theorem, there 
is an increasing (cf λ)-sequence f that converges to λ. 

Suppose that cf λ is the supremum of some set S of smaller ordinals; 
we must show that cf λ ≤ card S. (This will prove that cf λ ≤ cf cf λ, and we 
will be done.) To do this, consider the set 


f[S] = {f(α) | α ∈ S}. 


This is a subset of λ having the same cardinality as S. It suffices to 
show that f[S] is cofinal in λ, since this implies that 


cf λ ≤ card f[S] = card S. 


Consider then any α ∈ λ. Since f converges to λ, we have α ∈ f(β) ∈ λ 
for some β ∈ cf λ. Since cf λ = sup S, we have β ∈ γ ∈ S for some γ. Hence 


α ∈ f(β) ∈ f(γ) ∈ f[S]. 

So f[S] is indeed cofinal in λ. ⊣ 


Now we want to study the particular case where λ is an infinite 
cardinal number. In this case, we can give an “all cardinal” characterization 
of cf λ that makes no direct mention of ordinal numbers. 


Theorem 9T Assume that λ is an infinite cardinal. Then cf λ is the 
least cardinal number κ such that λ can be decomposed into the union 
of κ sets, each having cardinality less than λ. 


Proof We know that λ is the union of cf λ smaller ordinals, and any 
smaller ordinal α satisfies 


card α ≤ α ∈ λ = card λ. 

Hence λ can be represented as the union of cf λ sets of cardinality less 
than λ. It remains to show that we cannot make do with fewer than 
cf λ such sets. 

So consider an arbitrary decomposition λ = ⋃𝒜 where each member 
of 𝒜 has cardinality less than λ. Let κ = card 𝒜; we seek to prove that 
cf λ ≤ κ. We can express 𝒜 as the range of a κ-sequence of sets: 


𝒜 = {A_ξ | ξ ∈ κ}. 


Thus λ = ⋃{A_ξ | ξ ∈ κ} and card A_ξ < λ. 
Define the cardinal μ = sup{card A_ξ | ξ ∈ κ}. Then each card A_ξ ≤ μ and 
so λ = card ⋃{A_ξ | ξ ∈ κ} ≤ μ·κ (compare Exercise 26 of Chapter 6). 


Case I λ ≤ κ. Then cf λ ≤ λ ≤ κ, so we are done. 


Case II κ < λ. Then since λ ≤ μ·κ, we can conclude that λ = μ. 
Thus we have λ = sup{card A_ξ | ξ ∈ κ}, so λ is the supremum of κ smaller 
ordinals. Hence cf λ ≤ κ. ⊣ 


Cantor’s theorem tells us that 2^κ is always greater than κ. The following 
theorem tells us that even cf 2^κ is greater than κ. 


König’s Theorem Assume that κ is an infinite cardinal number. 
Then κ < cf 2^κ. 


Proof Suppose to the contrary that cf 2^κ ≤ κ. Then 2^κ, and consequently 
any set of size 2^κ, is representable as the union of κ sets each of size less 
than 2^κ. The particular set to consider is ^κS, where S is some set of size 2^κ. 
(Thus card ^κS = (2^κ)^κ = 2^κ.) 

We have, then, a representation of the form 


^κS = ⋃{A_ξ | ξ ∈ κ}, 

where card A_ξ < 2^κ. For any one ξ in κ, 


{g(ξ) | g ∈ A_ξ} ⊆ S. 


Furthermore we have proper inclusion, because card{g(ξ) | g ∈ A_ξ} ≤ 
card A_ξ < 2^κ, whereas card S = 2^κ. So we can choose some point 


s_ξ ∈ S − {g(ξ) | g ∈ A_ξ}. 


This construction yields a κ-sequence s, i.e., s ∈ ^κS. But s ∉ A_ξ for any 
ξ ∈ κ, by the construction. ⊣ 
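

The diagonal step of this proof can be imitated with finite sets. In the 
following Python sketch the index set is {0, 1, 2}, functions into S are 
represented as tuples, and each family has fewer members than S; all of 
these choices, and the name diagonal_escape, are assumptions made for the 
illustration. The sketch shows only how the point s_ξ is chosen; the 
cardinal arithmetic of the theorem has no finite counterpart. 

```python
def diagonal_escape(families, S):
    """Build s with s[xi] outside {g[xi] : g in families[xi]} for each xi.

    families[xi] plays the role of A_xi: a collection of functions from
    the index set into S, represented as tuples. Each family must have
    fewer members than S, so that some point of S is always left over.
    """
    s = []
    for xi, family in enumerate(families):
        values_at_xi = {g[xi] for g in family}
        leftover = sorted(set(S) - values_at_xi)
        if not leftover:
            raise ValueError("family %d covers all of S at coordinate %d" % (xi, xi))
        s.append(leftover[0])
    return tuple(s)


S = range(4)
families = [
    [(0, 0, 0), (1, 1, 1), (2, 2, 2)],
    [(0, 1, 2), (3, 3, 3)],
    [(0, 0, 0), (0, 0, 1), (0, 0, 3)],
]
s = diagonal_escape(families, S)
print(s)  # (3, 0, 2)
```

The resulting sequence differs from every member of the ξ-th family at 
coordinate ξ, so it belongs to none of the families. 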


Corollary 9U 2^{ℵ_0} ≠ ℵ_ω. 


Proof By König’s theorem, cf 2^{ℵ_0} is uncountable. But cf ℵ_ω = ℵ_0. ⊣ 


Exercises 


12. For any set S of ordinals, define the strict supremum of S, ssup S, 
to be the least ordinal strictly greater than every member of S. Show 
that the cofinality of any ordinal α (zero, successor, or limit) is the least 
cardinal κ such that α is the strict supremum of κ smaller ordinals. 


13. Show that cf ℶ_λ = cf λ for any limit ordinal λ. 


14. Assume that R is a well-founded relation, κ is a regular infinite 
cardinal, and card{x | xRy} < κ for each y. Prove that card{x | xR^t y} < κ 
for each y. 


15. Prove that any inaccessible cardinal is also weakly inaccessible. 


16. Assume that λ is weakly inaccessible, i.e., it is a limit ordinal for 
which ℵ_λ is regular. Further assume that the generalized continuum 
hypothesis holds. Show that λ is an inaccessible cardinal. 


17. (König) Assume that for each i in a set I we are given sets A_i 
and B_i with card A_i < card B_i. Show that card ⋃_{i∈I} A_i < card ⨉_{i∈I} B_i. 
[Suggestion: If f maps ⋃_{i∈I} A_i onto ⨉_{i∈I} B_i, then select a point b_i in B_i 
not in the projection of f[A_i].] 

18. Consider the operation assigning to each ordinal number α its cofinality 
cf α. Is this operation monotone? Continuous? Normal? (Give proofs or 
counterexamples.) 


19. Assume that κ is a regular cardinal and that S is a subset of V_κ with 
card S < κ. Show that S ∈ V_κ. 


20. Consider a normal operation assigning t_α to each ordinal number α, 
and assume that λ is a limit ordinal. Show that cf t_λ = cf λ. 


APPENDIX 


NOTATION, LOGIC, AND PROOFS 


In Chapters 1 and 2 we described how formulas of the language of 
set theory could be built up from component parts. We start with the 
indivisible “atomic” formulas, such as ‘x ∈ S’ and ‘x = y’. Once we know 
what objects are named by the letters ‘x’, ‘y’, and ‘S’, we can consider the 
truth or falsity of such formulas.¹ 

Given any formulas φ and ψ, we can combine them in various ways 
to obtain new ones. For example, ⌜(φ & ψ)⌝ will be a new formula. The 
intended meaning of the ampersand can be captured by giving a truth table 
(Table 1). And similarly we can specify by this table that ‘or’ is to mean 
“one or the other or both.” A simpler case is ‘¬’; the truth value of 
⌜(¬φ)⌝ is determined by the last column of Table 1. 


¹ In this appendix we utilize the convention of forming the name of an expression by the 
use of single quotation marks. For example, x might be a set, but ‘x’ is a letter used to 
name that set. We also utilize corners, e.g., if φ is the formula ‘x = x’, then ⌜(φ & φ)⌝ 
is the formula ‘(x = x & x = x)’. 

TABLE 1 

φ    ψ    (φ & ψ)    (φ or ψ)    (¬φ) 
T    T       T           T         F 
T    F       F           T         F 
F    T       F           T         T 
F    F       F           F         T 


Example We can show that the truth value, T or F, of ⌜(¬(φ & ψ))⌝ 
is always the same as the truth value of ⌜((¬φ) or (¬ψ))⌝ by making 
a table (Table 2) illustrating every possibility. 


TABLE 2 

φ    ψ    (¬(φ & ψ))    ((¬φ) or (¬ψ)) 
T    T        F                F 
T    F        T                T 
F    T        T                T 
F    F        T                T 


Example The truth value of ⌜(φ & (ψ or θ))⌝ is always the same as the 
truth value of ⌜((φ & ψ) or (φ & θ))⌝. The table for this must have eight 
lines to cover all possibilities. You are invited to contemplate the relationship 
of this example to Fig. 5 in Chapter 2. 


The formula ⌜(φ ⇒ ψ)⌝ is to be read as “if φ, then ψ.” The exact 
meaning is determined by Table 3. Note in particular that the formula 
is “vacuously” true whenever φ is false. The formula is sometimes read as 
“φ implies ψ.” This usage of the word “implies” is a conventional part 
of mathematical jargon, but it is not exactly the way the rest of the world 
uses the word. 


TABLE 3 

φ    ψ    (φ ⇒ ψ)    (φ ⟺ ψ) 
T    T       T           T 
T    F       F           F 
F    T       T           F 
F    F       T           T 


Example The truth value of ⌜(φ ⇒ ψ)⌝ is always the same as the truth 
value of its contrapositive ⌜((¬ψ) ⇒ (¬φ))⌝. (You should check this.) 
Sometimes when it is desired to prove a formula of the form ⌜(φ ⇒ ψ)⌝, 
it is actually more convenient to prove ⌜((¬ψ) ⇒ (¬φ))⌝. And this suffices 
to establish the truth of ⌜(φ ⇒ ψ)⌝. 


Example If θ is true, then the truth value of ⌜((¬φ) ⇒ (¬θ))⌝ is the 
same as the truth value of φ. Sometimes to prove φ, we proceed in an 
indirect way: We show that ⌜(¬φ)⌝ would lead to a result contradicting 
what is already known to be true, i.e., we use a “proof by contradiction.” 


Example The truth value of ⌜(¬(φ ⇒ ψ))⌝ is the same as the truth 
value of ⌜(φ & (¬ψ))⌝. This is reflected in the fact that to give a 
counterexample to show that ⌜(φ ⇒ ψ)⌝ is false, we give an example in 
which φ is true and ψ is false. 
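

Because each of these claims concerns only finitely many combinations of 
truth values, it can be confirmed by running through every line of the 
corresponding table. The following Python sketch does this; the helper 
names implies and equivalent are hypothetical, and Python's own and, or, 
and not are assumed to stand in for ‘&’, ‘or’, and ‘¬’. 

```python
from itertools import product


def implies(p, q):
    """Truth table of the conditional: false only when p is true and q is false."""
    return (not p) or q


def equivalent(lhs, rhs, arity):
    """Check that two truth functions agree on every assignment of T/F values."""
    rows = product([True, False], repeat=arity)
    return all(lhs(*row) == rhs(*row) for row in rows)


checks = {
    "De Morgan": equivalent(
        lambda p, q: not (p and q),
        lambda p, q: (not p) or (not q), 2),
    "distributive": equivalent(
        lambda p, q, r: p and (q or r),
        lambda p, q, r: (p and q) or (p and r), 3),
    "contrapositive": equivalent(
        lambda p, q: implies(p, q),
        lambda p, q: implies(not q, not p), 2),
    # With theta true, (not phi) => (not theta) reduces to phi.
    "contradiction": equivalent(
        lambda p: implies(not p, not True),
        lambda p: p, 1),
    "negated conditional": equivalent(
        lambda p, q: not implies(p, q),
        lambda p, q: p and (not q), 2),
}
print(checks)
```

Each entry of checks comes out True. By contrast, comparing a conditional 
with its converse in the same way reports a disagreement, since the two 
tables differ. 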


To construct any interesting formulas, we must use (in addition to 
features already mentioned) the phrases “for all x” and “for some x.” 
These phrases can be symbolized by ‘∀x’ and ‘∃x’, respectively. If φ is a 
formula (in which ‘x’ occurs but ‘∀x’ and ‘∃x’ do not), then the 
condition for ⌜∀x φ⌝ to be true is that φ should be true no matter what 
‘x’ names (in the universe of all sets). And the condition for ⌜∃x φ⌝ to be 
true is that there is at least one thing (in the universe of all sets) such 
that φ is true when ‘x’ names that thing. These criteria do not reduce 
to simple tables. There is no mechanical procedure that can be substituted 
for clear thinking. 


Example The truth value of ⌜(¬∀x φ)⌝ is the same as the truth value of 
⌜∃x(¬φ)⌝. To deny that φ is true of everything is to assert that there is at 
least one thing of which φ is false, and of which ⌜(¬φ)⌝ is consequently 
true. Similarly the truth value of ⌜(¬∃x φ)⌝ is the same as the truth value 
of ⌜∀x(¬φ)⌝. 


Example Suppose that ⌜∃x ∀y φ⌝ is true. Then, reading from left to right, 
we can say that there is some set x for which ⌜∀y φ⌝ is true. That is, there 
is at least one set that, when we bestow the name ‘x’ upon it, we then 
find that no matter what set ‘y’ names, φ is true. This guarantees that 
the weaker statement ⌜∀y ∃x φ⌝ is true. This weaker statement demands 
only that for any set (upon which we momentarily bestow the name ‘y’) 
there exists some set (which we call ‘x’) such that φ is true. The difference 
here is that the choice of x may depend on y, whereas ⌜∃x ∀y φ⌝ demands 
that there be an x fixed in advance that is successful with every y. For 
example, it is true of the real numbers that for every number y there is some 
number x (e.g., x = y + 1) with y < x. But the stronger statement, that there 
is some number x such that for every number y we have y < x, is false. 
Or to give a different sort of example, it may well be true that every boy 
has some girl whom he admires, and yet there is no one girl so lucky 
as to be admired by every boy. 
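

Over a finite domain both orders of quantification can be evaluated 
directly. The following Python sketch is an illustration under assumptions 
of its own: the domain is {0, 1, 2, 3, 4}, and the sample conditions differs 
and at_least are invented for the example (the real-number example above 
cannot be reproduced on a finite domain, which always has a largest element). 

```python
def forall_exists(domain, phi):
    """Truth of 'for every y there is some x with phi(x, y)' over a finite domain."""
    return all(any(phi(x, y) for x in domain) for y in domain)


def exists_forall(domain, phi):
    """Truth of 'there is some x such that for every y, phi(x, y)' over a finite domain."""
    return any(all(phi(x, y) for y in domain) for x in domain)


def differs(x, y):
    """x is different from y; the witness x must depend on y."""
    return x != y


def at_least(x, y):
    """x is at least y; a single x (the largest element) works for every y."""
    return x >= y


domain = range(5)
print(forall_exists(domain, differs), exists_forall(domain, differs))    # True False
print(forall_exists(domain, at_least), exists_forall(domain, at_least))  # True True
```

The first line shows the weaker statement holding while the stronger one 
fails; the second shows a case where a single x fixed in advance succeeds, 
so that both statements hold. 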


Next we want to give an example of a proof. Actually, the book is full 
of proofs. But what we want to do now is to illustrate how one constructs 
a proof, without knowing in advance how to do it. After one knows how 
the proof goes, it can be written out in the conventional form followed by 
contemporary mathematics books. But before one knows how the proof goes, 
it is a matter of looking at the assumptions and the conclusions and 
trying to gain insight into the connection between them. Suppose, for 
example, one wants to prove the assertion: 


a ∈ B ⇒ 𝒫a ∈ 𝒫𝒫⋃B. 


We will present the process of constructing the proof as a discussion 
among three mental states: Prover, Referee, and Commentator. 


P: We assume that a ∈ B. 

C: We might as well, because the assertion to be proved is vacuously 
true if a ∉ B. 

R: It is to be proved that 𝒫a ∈ 𝒫𝒫⋃B. 

C: Always keep separate the facts you have available and the facts 
that you want to establish. 

P: I have a ∈ B, but it is not obvious how to utilize that fact. 

R: Look at the goal. It suffices to show that 𝒫a ⊆ 𝒫⋃B, by the 
definition of 𝒫. 

P: Well, in that case I will assume that c ∈ 𝒫a. 

R: Then the goal is to show that c ∈ 𝒫⋃B. 

C: The point is to consider an arbitrary member (call it c) of 𝒫a, 
and show that it belongs to 𝒫⋃B. Then we can conclude that 𝒫a ⊆ 𝒫⋃B. 

P: c ⊆ a. 

R: The goal is to get c ⊆ ⋃B. 

C: This pair of sentences is a recasting of the previous pair, using 
definitions. 

P: Since I am expected to prove that c ⊆ ⋃B, I will consider any 
x ∈ c. 

R: The goal now is to get x ∈ ⋃B. 

P: Looking back, I see that 


x ∈ c ⊆ a ∈ B. 


R: But you want x ∈ ⋃B. 
P: x ∈ a ∈ B, so I have x ∈ ⋃B. 
C: The job is done. The information Prover has is the same as the 
information Referee wants. 


Now that the mental discussion is over, the proof can be written out 
in the conventional style. Here it is: 


Proof Assume that a ∈ B. To show that 𝒫a ∈ 𝒫𝒫⋃B, it suffices to 
show that 𝒫a ⊆ 𝒫⋃B. So consider any c ∈ 𝒫a. Then we have 


x ∈ c ⇒ x ∈ c ⊆ a ∈ B 
⇒ x ∈ a ∈ B 
⇒ x ∈ ⋃B. 


Thus c ⊆ ⋃B and c ∈ 𝒫⋃B. Since c was arbitrary in 𝒫a, we have shown 
that 𝒫a ⊆ 𝒫⋃B, as desired. ⊣ 
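

The assertion just proved can be spot-checked on a small example. The 
following Python sketch models sets as frozensets and uses two hypothetical 
helpers, powerset and big_union, for 𝒫 and ⋃; the particular sets a and B 
are made up for the illustration. A check of one instance is of course no 
substitute for the proof. 

```python
from itertools import chain, combinations


def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def big_union(B):
    """The union of B: all x such that x belongs to some member of B."""
    return frozenset(x for b in B for x in b)


a = frozenset({1, 2})
B = frozenset({a, frozenset({2, 3})})

# The key step of the proof: every c in P(a) is a subset of the union of B,
# so P(a) is included in P(union of B).
step = powerset(a) <= powerset(big_union(B))
# Hence P(a) is itself a member of P(P(union of B)).
conclusion = powerset(a) in powerset(powerset(big_union(B)))
print(step, conclusion)  # True True
```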


We conclude this appendix with some comments on a task mentioned 
in Chapter 2. In that chapter there is a fairly restrictive definition of 
what a formula is. In particular, to obtain a legal formula, it is necessary 
to get rid of the defined symbols, such as ‘∅’, ‘⋃’, ‘𝒫’, etc. We will 
indicate a mechanical procedure for carrying out this elimination 
process. 

Suppose then, we have a statement ⌜__𝒫t__⌝ in which ‘𝒫’ occurs. 
We can rewrite it first as 


∀a(a = 𝒫t ⇒ __a__) 

and then rewrite ‘a = 𝒫t’ as 

∀x(x ∈ a ⟺ x ∈ 𝒫t). 


This reduces the problem to eliminating ‘𝒫’ from statements of the form 
‘x ∈ 𝒫t’. And this is easy; ‘x ∈ 𝒫t’ can be replaced by ‘x ⊆ t’ or by 
‘∀y(y ∈ x ⇒ y ∈ t)’. 

This process is similar for other symbols. For the union symbol, 
we reduce the problem to eliminating ‘⋃’ from ‘x ∈ ⋃t’. And the 
definition of ‘⋃’ tells us how to do this; we replace ‘x ∈ ⋃t’ by 
‘∃y(x ∈ y ∈ t)’. 


LIST OF AXIOMS 


Extensionality axiom 
∀A ∀B[∀x(x ∈ A ⟺ x ∈ B) ⇒ A = B] 
Empty set axiom 
∃B ∀x x ∉ B 
Pairing axiom 
∀u ∀v ∃B ∀x(x ∈ B ⟺ x = u or x = v) 
Union axiom 
∀A ∃B ∀x[x ∈ B ⟺ (∃b ∈ A) x ∈ b] 
Power set axiom 
∀a ∃B ∀x(x ∈ B ⟺ x ⊆ a) 


Subset axioms For each formula φ not containing B, the following is an 
axiom: 


∀t_1 ⋯ ∀t_k ∀c ∃B ∀x(x ∈ B ⟺ x ∈ c & φ) 

Infinity axiom 
∃A[∅ ∈ A & (∀a ∈ A) a⁺ ∈ A] 
Choice axiom 
(∀ relation R)(∃ function F)(F ⊆ R & dom F = dom R) 


Replacement axioms For any formula φ(x, y) not containing the letter B, 
the following is an axiom: 


∀t_1 ⋯ ∀t_k ∀A[(∀x ∈ A) ∀y_1 ∀y_2(φ(x, y_1) & φ(x, y_2) ⇒ y_1 = y_2) 
⇒ ∃B ∀y(y ∈ B ⟺ (∃x ∈ A) φ(x, y))] 


Regularity axiom 
(∀A ≠ ∅)(∃m ∈ A) m ∩ A = ∅ 


Abelian group, 95, 122 
Absolute value, 109, 118 
Absorption law, 164 
Abstraction, 4-6, 20-21, 30, 37 
Addition, see Arithmetic 
Additive inverse, 95, 104, 117 
Aleph, 137, 212 

Algebra of sets, 27-31 
Algebraic numbers, 161 
Antinomies, see Paradoxes 
Archimedean, 120 
Arithmetic 


of cardinal numbers, 138-143, 149, 164 


of integers, 92-100 


of natural numbers, 79-82, 85 


of order types, 222-226 

of ordinal numbers, 227-239 
of rational numbers, 103-110 
of real numbers, 114-119 


Associative laws, 28, see also Arithmetic 


Atomic formula, 263 
Atoms, 7-9, 13 

Aussonderung axioms, 15, 21, see also Subset axioms 

Axiom of choice, see Choice axiom 

Axiomatic method, 10-11, 66, 125 

Axioms of set theory, 15, 166, 271-272, see also individual axioms 

Bar-Hillel, Yehoshua, 269 
Bernays, Paul, 15 
Bernstein, Felix, 148 
Bernstein’s theorem, 147 
Berry, G., 6 

Berry paradox, 5 

Beth, 214, 254 

Binary operation, 79 
Binary relation, 42 
Borel, Emile, 148 
Bound, 114, 171 
Bounded, 114, 212 
Burali-Forti, Cesare, 15, 194 
Burali-Forti theorem, 194 
Burrill, Claude, 112 


Cancellation laws 
for cardinal numbers, 141 
for integers, 100 
for natural numbers, 86 
for ordinal numbers, 234 
for rational numbers, 110 
for real numbers, 118 
Canonical map, 58 


Cantor, Georg, 14-15, 112, 148, 166, 194, 269-270 
Cantor-Bernstein theorem, 148 
Cantor normal form, 238 
Cantor’s theorem, 132, 261 
Cardinal comparability, 151 


Cardinal numbers, 136-137, 197, 199, 221 


arithmetic of, 138-143, 149, 164 
ordering of, 145 
Cardinality, 137 
Cartesian product, 37 
infinite, 54 
Cauchy sequence, 112 
Chain, 151 
descending, 173 
Characteristic function, 131 
Choice axiom, 11, 49, 55, 151-155, 198 
Choice function, 151 
Class, 6, 10, 15, 20 
Closed, 68, 70, 107, 216 
Closure, 78 
transitive, 178, 244 
Cofinal, 257 
Cofinality, 257, 260 
Cohen, Paul, 166, 269 
Commutative diagram, 59 


Commutative laws, 28, see also Arithmetic 


Commutative ring, 122 

Comparability, 151 

Compatible, 60 

Complement, 27, see also Relative complement 

Complete ordered field, 119, 123 

Composition, 44, 47 

Comprehension, see Abstraction 

Connected, 63 

Constructed, 175-177, 180, 211 

Constructive method, 66, 125 

Constructivity, 154 

Continuous, 216 

Continuous functions, 165 

Continuum hypothesis, 165, 215 
generalized, 166, 215 

Contradiction, proof by, 265 

Contrapositive, 265 

Converge, 259 

Coordinate, 37, 54 

Coordinate plane, 40 

Corners, 263 

Correspondence, 129 

Countable, 159 

Counterexample, 87, 265 

Counting, 136, 197 

Cut, 112-113 

De Morgan’s laws, 28, 31, 264 

Dedekind, Richard, 14, 70, 157, 270 

Dedekind cut, 112-113 

Defined terms, 23, 267 

Denominator, 102 

Dense, 111, 227 

Derived operation, 219 

Descartes, René, 37 

Descending chain, 173 

Difference, 50, 91, see also Relative complement 

Disjoint, 3, 57 

Distributive laws, 28, 30, see also Arithmetic 

Divisibility, 40, 167 

Division, 107 

Division theorem, 236 

Domain, 40 

Dominate, 145 

Drake, Frank, 269 


Element, 1 

Embedding, 100, 110, 119 
Empty set, 2, 18 

Empty set axiom, 18 

End extension, 230 
Enderton, Herbert, 12 

Epsilon, 16 

Epsilon-image, 182 